Low complexity affine merge mode for versatile video coding

ABSTRACT

In some aspects, the disclosure is directed to methods and systems for reducing memory utilization and increasing efficiency during affine merge mode for versatile video coding by utilizing motion vectors stored in a motion data line buffer for a prediction unit of a second coding tree unit neighboring a first coding tree unit to derive control point motion vectors for the first coding tree unit.

RELATED APPLICATIONS

The present application claims the benefit of and priority as a continuation to U.S. Nonprovisional application Ser. No. 16/453,672, entitled “Low Complexity Affine Merge Mode for Versatile Video Coding,” filed Jun. 26, 2019; which claims priority to U.S. Provisional Application No. 62/690,583, entitled “Low Complexity Affine Merge Mode for Versatile Video Coding,” filed Jun. 27, 2018; and U.S. Provisional Application No. 62/694,643, entitled “Low Complexity Affine Merge Mode for Versatile Video Coding,” filed Jul. 6, 2018; and U.S. Provisional Application No. 62/724,464, entitled “Low Complexity Affine Merge Mode for Versatile Video Coding,” filed Aug. 29, 2018, the entirety of each of which is incorporated by reference herein.

FIELD OF THE DISCLOSURE

This disclosure generally relates to systems and methods for video encoding and compression. In particular, this disclosure relates to systems and methods for low complexity affine merge mode for versatile video coding.

BACKGROUND OF THE DISCLOSURE

Video coding or compression standards allow for digital transmission of video over a network, reducing the bandwidth required to transmit high resolution frames of video to a fraction of its original size. These standards may be lossy or lossless, and incorporate inter- and intra-frame compression, with constant or variable bit rates.

BRIEF DESCRIPTION OF THE DRAWINGS

Various objects, aspects, features, and advantages of the disclosure will become more apparent and better understood by referring to the detailed description taken in conjunction with the accompanying drawings, in which like reference characters identify corresponding elements throughout. In the drawings, like reference numbers generally indicate identical, functionally similar, and/or structurally similar elements.

FIG. 1A is an illustration of an example of dividing a picture into coding tree units and coding units, according to some implementations;

FIG. 1B is an illustration of different splits of coding tree units, according to some implementations;

FIG. 2 is a block diagram of a versatile video coding (VVC) decoder, according to some implementations;

FIG. 3 is an illustration of motion data candidate positions for merging candidate list derivations, according to some implementations;

FIG. 4A is an illustration of an affine motion model, according to some implementations;

FIG. 4B is an illustration of a restricted affine motion model, according to some implementations;

FIG. 4C is an illustration of an expanded affine motion model, according to some implementations;

FIG. 5 is an illustration of motion data candidate positions for an affine merge mode, according to some implementations;

FIG. 6A is an illustration of inheriting affine motion data from neighbors in a 4-parameter affine motion model, according to some implementations;

FIG. 6B is an illustration of inheriting affine motion data from neighbors in a 6-parameter affine motion model, according to some implementations;

FIG. 6C is an illustration of line buffer storage for affine merge mode in a 4-parameter affine motion model, according to some implementations;

FIG. 7A is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip mode, for a 4-parameter affine motion model, according to some implementations;

FIG. 7B is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip mode, for a 6-parameter affine motion model, according to some implementations;

FIG. 8A is an illustration of parallel derivation of control point vectors and sub-block vectors for a current prediction unit, according to some implementations;

FIG. 8B is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip mode, for a 6-parameter affine motion model, modified from the implementation of FIG. 7B;

FIG. 9A is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip mode, for an adaptive affine motion model, according to some implementations;

FIG. 9B is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip mode, with control point vectors stored in a regular motion data line buffer, according to some implementations;

FIG. 9C is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip utilizing a local motion data buffer, according to some implementations;

FIG. 10 is a flow chart of a method for decoding video via an adaptive affine motion model, according to some implementations;

FIG. 11A is a block diagram depicting an embodiment of a network environment; and

FIGS. 11B and 11C are block diagrams depicting embodiments of computing devices useful in connection with the methods and systems described herein.

The details of various embodiments of the methods and systems are set forth in the accompanying drawings and the description below.

DETAILED DESCRIPTION

The following video compression standard(s), including any draft versions of such standard(s), are hereby incorporated herein by reference in their entirety and are made part of the present disclosure for all purposes: MPEG VVC; ITU-T H.266. Although this disclosure may reference aspects of these standard(s), the disclosure is in no way limited by these standard(s).

For purposes of reading the description of the various embodiments below, the following descriptions of the sections of the specification and their respective contents may be helpful:

-   Section A describes embodiments of systems and methods for versatile video coding; and
-   Section B describes a network environment and computing environment which may be useful for practicing embodiments described herein.

A. Low Complexity Affine Merge Mode for Versatile Video Coding

VVC (Versatile Video Coding) video compression employs a flexible block coding structure to achieve higher compression efficiency. As shown in FIG. 1A, in VVC a picture 100 is divided into coding tree units (CTUs) 102. In some implementations, a CTU can be up to 128×128 pixels in size. A CTU 102 is made up of one or more coding units (CUs) 104 which may be of the same or different sizes, as shown. In some implementations, CUs may be generated by using recursive splits of larger CUs or CTUs. As shown in FIG. 1B, in some implementations, a quad-tree plus binary and triple tree (QTBTT) recursive block partitioning structure is used to divide a CTU 102 into CUs 104. In some implementations, a CU 104 can have a four-way split by using quad-tree partitioning (e.g. QT split at left); a two-way split by using horizontal or vertical binary tree partitioning (e.g. horizontal BT split and vertical BT split, at top center and top right); or a three-way split by using horizontal or vertical triple tree partitioning (e.g. horizontal TT split and vertical TT split, at bottom center and bottom right). A CU 104 can be as large as a CTU 102 (e.g. having no splits), and as small as a 4×4 pixel block.
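As a rough illustration of the five split types of FIG. 1B, the sketch below computes the child blocks produced by a single split. The function name `split_block`, the mode labels, and the `(x, y, width, height)` tuple layout are hypothetical. The 1:2:1 ratio used for triple tree splits, and the reading of a "horizontal" split as one that produces stacked (top/bottom) children, are assumptions based on common VVC practice; the text above does not state them.

```python
from typing import List, Tuple

Block = Tuple[int, int, int, int]  # hypothetical layout: (x, y, width, height) in pixels


def split_block(x: int, y: int, w: int, h: int, mode: str) -> List[Block]:
    """Return the child blocks produced by one QTBTT split of a block."""
    if mode == "QT":  # quad-tree: four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_H":  # horizontal binary split: top and bottom halves (assumed)
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "BT_V":  # vertical binary split: left and right halves (assumed)
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "TT_H":  # horizontal triple split with assumed 1:2:1 heights
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    if mode == "TT_V":  # vertical triple split with assumed 1:2:1 widths
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    raise ValueError(f"unknown split mode: {mode}")
```

Applying `split_block` recursively to its own outputs, down to a 4×4 floor, would produce a CU layout of the kind shown in FIG. 1A.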

In many implementations of VVC, there is no concept of splitting a CU 104 into prediction units (PUs) and Transform Units (TUs) at the CU level, as in some implementations of high efficiency video coding (HEVC). In some implementations, a CU 104 is also a PU and a TU, except for implementations in which the CU size may be larger than the maximum TU size allowed (e.g. the CU size is 128×128 pixels, but the maximum TU size is 64×64 pixels), in which case a CU 104 is forced to split into multiple PUs and/or TUs. Additionally, there are occasions where the TU size is smaller than the CU size, namely in Intra Sub-Partitioning (ISP) and Sub-Block Transforms (SBT). Intra sub-partitioning (ISP) splits an intra-CU, either vertically or horizontally, into 2 or 4 TUs (for luma only; the chroma CU is not split). Similarly, sub-block transforms (SBT) split an inter-CU into either 2 or 4 TUs, and only one of these TUs is allowed to have non-zero coefficients. Within a CTU 102, some CUs 104 can be intra-coded, while others can be inter-coded. Such a block structure offers coding flexibility of using different CU/PU/TU sizes based on characteristics of incoming content, especially the ability of using large block size tools (e.g., large prediction unit size up to 128×128 pixels, large transform and quantization size up to 64×64 pixels), providing significant coding gains when compared to MPEG/ITU-T HEVC/H.265 coding.

FIG. 2 is a block diagram of a versatile video coding (VVC) decoder, according to some implementations. Additional steps may be included in some implementations, as described in more detail herein, including an inverse quantization step after the context-adaptive binary arithmetic coding (CABAC) decoder; an inter- or intra-prediction step (based on intra/inter modes signaled in the bitstream); an inverse quantization/transform step; a sample adaptive offset (SAO) filter step after the de-blocking filter; an advanced motion vector predictor (AMVP) candidate list derivation step for reconstructing motion data of the AMVP mode by adding the predictors to the MVDs (Motion Vector Differences) provided by the CABAC decoder; a merge/skip candidate list derivation step for reconstructing motion data of the merge/skip mode by selecting motion vectors from the list based on the merge index provided by the CABAC decoder; and a decoder motion data enhancement (DME) step providing refined motion data for inter-frame prediction.

In many implementations, VVC employs block-based intra/inter prediction, transform and quantization and entropy coding to achieve its compression goals. Still referring to FIG. 2 and in more detail, the VVC decoder employs CABAC for entropy coding. The CABAC engine decodes the incoming bitstream and delivers the decoded symbols including quantized transform coefficients and control information such as intra prediction modes, inter prediction modes, motion vector differences (MVDs), merge indices (merge_idx), quantization scales and in-loop filter parameters. The quantized transform coefficients may be processed via inverse quantization and an inverse transform to reconstruct the prediction residual blocks for a CU 104. Based on signaled intra- or inter-frame prediction modes, a decoder performs either intra-frame prediction or inter-frame prediction (including motion compensation) to produce the prediction blocks for the CU 104; the prediction residual blocks are added back to the prediction blocks to generate the reconstructed blocks for the CU 104. In-loop filtering, such as a bilateral filter, de-blocking filter, SAO filter, de-noising filter, adaptive loop filter (ALF) and Neural Network based in-loop filters, may be performed on the reconstructed blocks to generate the reconstructed CU 104 (e.g. after in-loop filtering) which is stored in the decoded picture buffer (DPB). In some implementations, one or more of the bilateral filter, de-noising filter, and/or Neural Network based in-loop filters may be omitted or removed. For hardware and embedded software decoder implementations, the DPB may be allocated on off-chip memory due to the reference picture data size.

For an inter-coded CU 104 (a CU 104 using inter-prediction modes), in some implementations, two modes may be used to signal motion data in the bitstream. If the motion data (motion vectors, prediction direction (list 0 and/or list 1), reference index (indices)) of an inter-coded PU is inherited from spatial or temporal neighbors of the current PU, either in merge mode or in skip mode, only the merge index (merge_idx) may be signaled for the PU; the actual motion data used for motion compensation can be derived by constructing a merging candidate list and then addressing it by using the merge_idx. If an inter-coded CU 104 is not using merge/skip mode, the associated motion data may be reconstructed on the decoder side by adding the decoded motion vector differences to the AMVPs (advanced motion vector predictors). Both the merging candidate list and AMVPs of a PU can be derived by using spatial and temporal motion data neighbors.

In many implementations, merge/skip mode allows an inter-predicted PU to inherit the same motion vector(s), prediction direction, and reference picture(s) from an inter-predicted PU which contains a motion data position selected from a group of spatially neighboring motion data positions and one of two temporally co-located motion data positions. FIG. 3 is an illustration of candidate motion data positions for a merge/skip mode, according to some implementations. For the current PU, a merging candidate list may be formed by considering merging candidates from one or more of the seven motion data positions depicted: five spatially neighboring motion data positions (e.g. a bottom-left neighboring motion data position A1, an upper neighboring motion data position B1, an upper-right neighboring motion data position B0, a down-left neighboring motion data position A0, and a top-left neighboring motion data position B2) and two temporal motion data positions (a motion data position H bottom-right to the temporally co-located PU, and a motion data position CR inside the temporally co-located PU). To derive motion data from a motion data position, the motion data is copied from the corresponding PU which contains (or covers) the motion data position.

The spatial merging candidates, if available, may be ordered in the order of A1, B1, B0, A0 and B2 in the merging candidate list. For example, the merging candidate at position B2 may be discarded if the merging candidates at positions A1, B1, B0 and A0 are all available. A spatial motion data position is treated as unavailable for the merging candidate list derivation if the corresponding PU containing the motion data position is intra-coded, belongs to a different slice from the current PU, or is outside the picture boundaries.

To choose the co-located temporal merging candidate (TMVP), the co-located temporal motion data from the bottom-right motion data position (e.g., (H) in FIG. 3, outside the co-located PU) is first checked and selected for the temporal merging candidate if available. Otherwise, the co-located temporal motion data at the central motion data position (e.g., (CR) in FIG. 3) is checked and selected for the temporal merging candidate if available. The temporal merging candidate is placed in the merging candidate list after the spatial merging candidates. A temporal motion data position (TMDP) is treated as unavailable if the corresponding PU containing the temporal motion data position in the co-located reference picture is intra-coded or outside the picture boundaries.

After adding available spatial and temporal neighboring motion data to the merging list, the list can be appended with the historical merging candidates, average and/or zero candidates until the merging candidate list size reaches a pre-defined or dynamically set maximum size (e.g. 6 candidates, in some implementations).
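The list construction described in the preceding paragraphs can be sketched as follows. This is a simplified illustration rather than a normative derivation: the function name `build_merge_list`, the dictionary inputs, and the reduction of motion data to a single `(mv_x, mv_y)` pair are hypothetical, availability checks are assumed to have been done by the caller (an unavailable position is passed as `None`), and historical and average candidates are omitted so that only zero candidates pad the list.

```python
from typing import Dict, List, Optional, Tuple

MotionData = Tuple[int, int]  # hypothetical simplification: one (mv_x, mv_y) pair

SPATIAL_ORDER = ["A1", "B1", "B0", "A0", "B2"]


def build_merge_list(spatial: Dict[str, Optional[MotionData]],
                     temporal: Dict[str, Optional[MotionData]],
                     max_size: int = 6) -> List[MotionData]:
    """Assemble a merging candidate list; None marks an unavailable position."""
    first_four = [spatial.get(pos) for pos in SPATIAL_ORDER[:4]]
    candidates = [mv for mv in first_four if mv is not None]
    # B2 is discarded when A1, B1, B0 and A0 are all available.
    if len(candidates) < 4 and spatial.get("B2") is not None:
        candidates.append(spatial["B2"])
    # Temporal candidate: bottom-right position H first, then central position CR.
    tmvp = temporal.get("H")
    if tmvp is None:
        tmvp = temporal.get("CR")
    if tmvp is not None:
        candidates.append(tmvp)
    # Pad with zero candidates (historical/average candidates omitted in this sketch).
    while len(candidates) < max_size:
        candidates.append((0, 0))
    return candidates[:max_size]
```

A decoder would then select the motion data for the PU as `build_merge_list(...)[merge_idx]`, using the merge index parsed from the bitstream.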

Because the merge/skip and AMVP candidate list derivations reference motion data from the top spatial neighboring PUs (e.g. B0-B2), and because CTUs are processed in raster scan order, a motion data line buffer is needed to store spatial neighboring motion data for those neighboring PUs located at the top CTU boundary.

Affine motion compensation prediction introduces a more complex motion model for better compression efficiency. In some coding implementations, only a translational motion model is considered in which all the sample positions inside a PU may have a same translational motion vector for motion compensated prediction. However, in the real world, there are many kinds of motion, e.g. zoom in/out, rotation, perspective motions and other irregular motions. The affine motion model described herein supports different motion vectors at different sample positions inside a PU, which effectively captures more complex motion. As shown in FIG. 4A, illustrating an implementation of an affine motion model, different sample positions inside a PU, such as four corner points of the PU, may have different motion vectors ($\vec{v}_0$ through $\vec{v}_3$) as supported by the affine mode. In FIG. 4A, the origin (0,0) of the x-y coordinate system is at the top-left corner point of a picture.

A PU coded in affine mode and affine merge mode may have uni-prediction (list 0 or list 1 prediction) or bi-directional prediction (i.e. list 0 and list 1 bi-prediction). If a PU is coded in bi-directional affine or bi-directional affine merge mode, the process of affine mode and affine merge mode described hereafter is performed separately for list 0 and list 1 predictions.

In the affine motion model, the motion vector $\vec{v}=(v_x,v_y)$ at a sample position $(x,y)$ inside a PU is defined as follows:

$$\begin{cases}
v_x = ax + cy + e \\
v_y = bx + dy + f
\end{cases} \qquad \text{(Equation 1)}$$

where a, b, c, d, e, f are the affine motion model parameters, which define a 6-parameter affine motion model.

A restricted affine motion model, e.g., a 4-parameter model, can be described with the four parameters by restricting a=d and b=−c in Equation 1:

$$\begin{cases}
v_x = ax - by + e \\
v_y = bx + ay + f
\end{cases} \qquad \text{(Equation 2)}$$

In the 4-parameter affine motion model proposed to the VVC, the model parameters a, b, e, f are determined by signaling two control point (seed) vectors at the top-left and top-right corner of a PU. As shown in FIG. 4B, with two control point vectors $\vec{v}_0=(v_{0x},v_{0y})$ at sample position $(x_0,y_0)$ and $\vec{v}_1=(v_{1x},v_{1y})$ at sample position $(x_1,y_1)$, Equation 2 can be rewritten as:

$$\begin{cases}
v_x = \dfrac{v_{1x}-v_{0x}}{x_1-x_0}(x-x_0) - \dfrac{v_{1y}-v_{0y}}{x_1-x_0}(y-y_0) + v_{0x} \\[2ex]
v_y = \dfrac{v_{1y}-v_{0y}}{x_1-x_0}(x-x_0) + \dfrac{v_{1x}-v_{0x}}{x_1-x_0}(y-y_0) + v_{0y}
\end{cases} \qquad \text{(Equation 3)}$$

One such implementation is illustrated in FIG. 4B, in which $(x_1-x_0)$ equals the PU width and $y_1=y_0$. In fact, to derive the parameters of the 4-parameter affine motion model, the two control point vectors do not have to be at the top-left and top-right corner of a PU as proposed in some methods. As long as the two control points have $x_1\neq x_0$ and $y_1=y_0$, Equation 3 is valid.
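To make Equation 3 concrete, the following sketch evaluates the 4-parameter model at an arbitrary sample position from two control point vectors. The function name `affine_mv_4param` and the tuple-based arguments are hypothetical, and floating-point arithmetic is used for readability; a real codec would use fixed-point motion vectors with shifts and rounding.

```python
from typing import Tuple

Vec = Tuple[float, float]


def affine_mv_4param(v0: Vec, v1: Vec, p0: Vec, p1: Vec, p: Vec) -> Vec:
    """Evaluate Equation 3 at sample position p.

    v0, v1 are the control point vectors located at positions p0, p1.
    """
    (v0x, v0y), (v1x, v1y) = v0, v1
    (x0, y0), (x1, y1) = p0, p1
    if x1 == x0 or y1 != y0:
        raise ValueError("Equation 3 requires x1 != x0 and y1 == y0")
    a = (v1x - v0x) / (x1 - x0)  # model parameter a
    b = (v1y - v0y) / (x1 - x0)  # model parameter b
    x, y = p
    vx = a * (x - x0) - b * (y - y0) + v0x
    vy = b * (x - x0) + a * (y - y0) + v0y
    return vx, vy
```

Evaluating the function at either control point position returns that control point's own vector, which is a quick sanity check on the signs in Equation 3.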

Likewise, for the 6-parameter affine motion model for some implementations of VVC, the model parameters a, b, c, d, e, f are determined by signaling three control point vectors at the top-left, top-right and bottom-left corner of a PU. As shown in FIG. 4C, with three control point vectors $\vec{v}_0=(v_{0x},v_{0y})$ at sample position $(x_0,y_0)$, $\vec{v}_1=(v_{1x},v_{1y})$ at sample position $(x_1,y_1)$ and $\vec{v}_2=(v_{2x},v_{2y})$ at sample position $(x_2,y_2)$, Equation 1 can be rewritten as:

$$\begin{cases}
v_x = \dfrac{v_{1x}-v_{0x}}{x_1-x_0}(x-x_0) + \dfrac{v_{2x}-v_{0x}}{y_2-y_0}(y-y_0) + v_{0x} \\[2ex]
v_y = \dfrac{v_{1y}-v_{0y}}{x_1-x_0}(x-x_0) + \dfrac{v_{2y}-v_{0y}}{y_2-y_0}(y-y_0) + v_{0y}
\end{cases} \qquad \text{(Equation 4)}$$

Note that in FIG. 4C, $(x_1-x_0)$ equals the PU width, $(y_2-y_0)$ equals the PU height, $y_1=y_0$ and $x_2=x_0$. To derive the parameters of the 6-parameter affine motion model, the three control point vectors do not have to be at the top-left, top-right and bottom-left corner of a PU as shown in FIG. 4C. As long as the three control points satisfy $x_1\neq x_0$, $y_2\neq y_0$, $y_1=y_0$ and $x_2=x_0$, Equation 4 is valid.

To constrain the memory bandwidth consumption of the affine mode for motion compensation, the motion vectors of a PU coded in affine mode are not derived for each sample in a PU. As shown in FIGS. 4B and 4C, all of the samples inside a sub-block (e.g. 4×4 block size) of the PU share a same motion vector, which is derived at a sample position (x,y) chosen for the sub-block by using Equation 3 or Equation 4 (depending on the type of affine motion model). The sample position (x,y) selected for the sub-block can be any position within the sub-block, such as the top-left corner or the middle point of the sub-block. This process may be referred to as the sub-block motion data derivation process of the affine mode. Note that the same sub-block motion data (i.e. sub-block motion vectors, prediction direction (list0/list1 unidirectional prediction or list0 and list1 bidirectional prediction) and reference indices) may be used by the motion compensation of the current PU coded in affine mode, used as spatial neighboring motion data in the merge/skip and (affine) AMVP list derivation of the adjacent PUs, and stored as temporal motion data (TMVPs) for use with future pictures (see FIG. 3).
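The sub-block motion data derivation process can be sketched as below for the 6-parameter model of Equation 4, with control points at the PU corners as in FIG. 4C. The names `affine_mv_6param` and `subblock_mvs` are hypothetical, the middle point of each sub-block is chosen as the evaluation position (one of the options mentioned above), and floating-point arithmetic stands in for the fixed-point arithmetic a real decoder would use.

```python
from typing import Dict, Tuple

Vec = Tuple[float, float]


def affine_mv_6param(v0: Vec, v1: Vec, v2: Vec,
                     p0: Vec, p1: Vec, p2: Vec, p: Vec) -> Vec:
    """Evaluate Equation 4 at sample position p."""
    (x0, y0), (x1, _), (_, y2) = p0, p1, p2
    x, y = p
    vx = ((v1[0] - v0[0]) / (x1 - x0) * (x - x0)
          + (v2[0] - v0[0]) / (y2 - y0) * (y - y0) + v0[0])
    vy = ((v1[1] - v0[1]) / (x1 - x0) * (x - x0)
          + (v2[1] - v0[1]) / (y2 - y0) * (y - y0) + v0[1])
    return vx, vy


def subblock_mvs(v0: Vec, v1: Vec, v2: Vec, pu_x: int, pu_y: int,
                 pu_w: int, pu_h: int, sb: int = 4) -> Dict[Tuple[int, int], Vec]:
    """Derive one motion vector per sb x sb sub-block of a PU.

    Control points are assumed to sit at the top-left, top-right and
    bottom-left corners of the PU, as in FIG. 4C.
    """
    p0 = (pu_x, pu_y)
    p1 = (pu_x + pu_w, pu_y)
    p2 = (pu_x, pu_y + pu_h)
    mvs = {}
    for sy in range(0, pu_h, sb):
        for sx in range(0, pu_w, sb):
            # Middle point of the sub-block, in picture coordinates.
            centre = (pu_x + sx + sb / 2, pu_y + sy + sb / 2)
            mvs[(sx // sb, sy // sb)] = affine_mv_6param(v0, v1, v2, p0, p1, p2, centre)
    return mvs
```

Every sample inside a sub-block then shares the single vector stored for that sub-block, so an 8×8 PU needs only four vectors rather than sixty-four.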

In the proposed affine mode, the control point vectors are differentially coded by taking the difference relative to the control point motion vector predictors (CPMVPs), which are derived by using the neighboring spatial and temporal motion data of the PU.

To further improve the compression efficiency, an affine merge mode may be utilized in some implementations of VVC. Similar to the regular merge/skip mode described above, a PU can also inherit affine motion data from neighbors in the affine merge mode without explicitly signaling the control point vectors. As shown in FIG. 5, in some implementations of an affine merge mode, a PU searches through the five spatial neighbors in the order of A, B, C, D and E (or in other orders, in some implementations), and inherits the affine motion data from the first neighboring block using the affine mode (“first” referring here to the first selected block, not necessarily block A), or from the multiple neighboring blocks using the affine mode. A more complex affine merge mode may also use temporal neighbors, as in the regular merge mode.

FIG. 6A illustrates how affine motion data may be inherited from a spatial neighbor in the case of the 4-parameter affine motion model discussed above, according to some implementations. In this example, block E is assumed to be the selected or first neighboring block using the affine mode from which the affine motion data is inherited. The control point vectors for the current PU, i.e. $\vec{v}_0=(v_{0x},v_{0y})$ at the top-left corner position $(x_0,y_0)$ and $\vec{v}_1=(v_{1x},v_{1y})$ at the top-right corner position $(x_1,y_1)$, are derived by using the control point vectors $\vec{v}_{E0}=(v_{E0x},v_{E0y})$ at the top-left sample position $(x_{E0},y_{E0})$, and $\vec{v}_{E1}=(v_{E1x},v_{E1y})$ at the top-right sample position $(x_{E1},y_{E1})$ of the neighboring PU containing block E, and using Equation 3:

$$\begin{cases}
v_{0x} = \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(x_0-x_{E0}) - \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(y_0-y_{E0}) + v_{E0x} \\[2ex]
v_{0y} = \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(x_0-x_{E0}) + \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(y_0-y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 5)}$$

$$\begin{cases}
v_{1x} = \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(x_1-x_{E0}) - \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(y_1-y_{E0}) + v_{E0x} \\[2ex]
v_{1y} = \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(x_1-x_{E0}) + \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(y_1-y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 6)}$$

As shown in Equations 5 and 6, to derive the control point vectors for the current PU, not only the control point vectors but also the PU size of the neighboring PU coded in the affine mode may be utilized, as $(x_{E1}-x_{E0})$ and $(y_0-y_{E0})$ are the PU width and height of the neighboring PU, respectively.
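Equations 5 and 6 amount to evaluating the neighbor's 4-parameter model at the two top corners of the current PU. The sketch below illustrates this under simplifying assumptions: the names `eval_4param` and `inherit_cpmv_4param` are hypothetical, positions are given in picture coordinates, and floating-point arithmetic replaces the fixed-point arithmetic of an actual decoder.

```python
from typing import Tuple

Vec = Tuple[float, float]


def eval_4param(v_e0: Vec, v_e1: Vec, p_e0: Vec, p_e1: Vec, p: Vec) -> Vec:
    """Evaluate the neighbor's 4-parameter model (form of Equation 3) at p."""
    width = p_e1[0] - p_e0[0]  # x_E1 - x_E0, the neighboring PU width
    a = (v_e1[0] - v_e0[0]) / width
    b = (v_e1[1] - v_e0[1]) / width
    dx, dy = p[0] - p_e0[0], p[1] - p_e0[1]
    return (a * dx - b * dy + v_e0[0], b * dx + a * dy + v_e0[1])


def inherit_cpmv_4param(v_e0: Vec, v_e1: Vec, p_e0: Vec, p_e1: Vec,
                        p0: Vec, p1: Vec) -> Tuple[Vec, Vec]:
    """Derive the current PU's control point vectors from neighboring PU E."""
    v0 = eval_4param(v_e0, v_e1, p_e0, p_e1, p0)  # Equation 5
    v1 = eval_4param(v_e0, v_e1, p_e0, p_e1, p1)  # Equation 6
    return v0, v1
```

For instance, with a 16×16 neighboring PU E at the picture origin and a current PU whose top-left corner is at (16, 16), the term `dy` for the top-left control point equals 16, the height of PU E.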

Similarly, for the example of the 6-parameter affine motion model shown in FIG. 6B, the control point vectors for the current PU, i.e. $\vec{v}_0=(v_{0x},v_{0y})$ at the top-left corner position $(x_0,y_0)$, $\vec{v}_1=(v_{1x},v_{1y})$ at the top-right corner position $(x_1,y_1)$ and $\vec{v}_2=(v_{2x},v_{2y})$ at the bottom-left corner position $(x_2,y_2)$, are derived by using the control point vectors $\vec{v}_{E0}=(v_{E0x},v_{E0y})$ at the top-left sample position $(x_{E0},y_{E0})$, $\vec{v}_{E1}=(v_{E1x},v_{E1y})$ at the top-right sample position $(x_{E1},y_{E1})$, and $\vec{v}_{E2}=(v_{E2x},v_{E2y})$ at the bottom-left sample position $(x_{E2},y_{E2})$ of the neighboring PU containing block E, and using Equation 4:

$$\begin{cases}
v_{0x} = \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(x_0-x_{E0}) + \dfrac{v_{E2x}-v_{E0x}}{y_{E2}-y_{E0}}(y_0-y_{E0}) + v_{E0x} \\[2ex]
v_{0y} = \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(x_0-x_{E0}) + \dfrac{v_{E2y}-v_{E0y}}{y_{E2}-y_{E0}}(y_0-y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 7)}$$

$$\begin{cases}
v_{1x} = \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(x_1-x_{E0}) + \dfrac{v_{E2x}-v_{E0x}}{y_{E2}-y_{E0}}(y_1-y_{E0}) + v_{E0x} \\[2ex]
v_{1y} = \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(x_1-x_{E0}) + \dfrac{v_{E2y}-v_{E0y}}{y_{E2}-y_{E0}}(y_1-y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 8)}$$

$$\begin{cases}
v_{2x} = \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(x_2-x_{E0}) + \dfrac{v_{E2x}-v_{E0x}}{y_{E2}-y_{E0}}(y_2-y_{E0}) + v_{E0x} \\[2ex]
v_{2y} = \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(x_2-x_{E0}) + \dfrac{v_{E2y}-v_{E0y}}{y_{E2}-y_{E0}}(y_2-y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 9)}$$

In some implementations, the current PU and the neighboring PU may use different types of affine motion models. For example, if the current PU uses the 4-parameter model but a neighboring PU (e.g. E) uses the 6-parameter model, then Equation 7 and Equation 8 can be used for deriving the two control point vectors for the current PU. Similarly, if the current PU uses the 6-parameter model but a neighboring PU (e.g. E) uses the 4-parameter model, then Equation 5, Equation 6 and Equation 10 can be used for deriving the three control point vectors for the current PU.

$$\begin{cases}
v_{2x} = \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(x_2-x_{E0}) - \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(y_2-y_{E0}) + v_{E0x} \\[2ex]
v_{2y} = \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(x_2-x_{E0}) + \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(y_2-y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 10)}$$

In some implementations, even if the neighboring PU uses the 4-parameter model, the control point vector $\vec{v}_{E2}=(v_{E2x},v_{E2y})$ at the bottom-left sample position $(x_{E2},y_{E2})$ of the neighboring PU containing block E may be derived using Equation 11 first, then Equation 7, Equation 8 (and Equation 9 if the current PU uses the 6-parameter model). Accordingly, the system may allow derivation of the control point vectors of the current PU, regardless of whether the current PU uses the 4- or 6-parameter model.

$$\begin{cases}
v_{E2x} = \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(x_{E2}-x_{E0}) - \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(y_{E2}-y_{E0}) + v_{E0x} \\[2ex]
v_{E2y} = \dfrac{v_{E1y}-v_{E0y}}{x_{E1}-x_{E0}}(x_{E2}-x_{E0}) + \dfrac{v_{E1x}-v_{E0x}}{x_{E1}-x_{E0}}(y_{E2}-y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 11)}$$
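The two-step approach (Equation 11 first, then Equations 7 to 9) can be illustrated with the sketch below. The names `derive_v_e2` and `eval_6param` are hypothetical, floating-point arithmetic is used for clarity, and the second row of Equation 11 is implemented with the sign convention of Equation 3, i.e. with a positive $(y_{E2}-y_{E0})$ term.

```python
from typing import Tuple

Vec = Tuple[float, float]


def derive_v_e2(v_e0: Vec, v_e1: Vec, p_e0: Vec, p_e1: Vec, p_e2: Vec) -> Vec:
    """Equation 11: bottom-left control point of a 4-parameter neighbor."""
    width = p_e1[0] - p_e0[0]
    a = (v_e1[0] - v_e0[0]) / width
    b = (v_e1[1] - v_e0[1]) / width
    dx, dy = p_e2[0] - p_e0[0], p_e2[1] - p_e0[1]
    return (a * dx - b * dy + v_e0[0], b * dx + a * dy + v_e0[1])


def eval_6param(v_e0: Vec, v_e1: Vec, v_e2: Vec,
                p_e0: Vec, p_e1: Vec, p_e2: Vec, p: Vec) -> Vec:
    """Common form of Equations 7-9: evaluate the neighbor's model at p."""
    width = p_e1[0] - p_e0[0]
    height = p_e2[1] - p_e0[1]
    dx, dy = p[0] - p_e0[0], p[1] - p_e0[1]
    vx = (v_e1[0] - v_e0[0]) / width * dx + (v_e2[0] - v_e0[0]) / height * dy + v_e0[0]
    vy = (v_e1[1] - v_e0[1]) / width * dx + (v_e2[1] - v_e0[1]) / height * dy + v_e0[1]
    return vx, vy
```

With this arrangement a single 6-parameter code path serves both neighbor types: a 4-parameter neighbor is first promoted to three control points, after which the inherited vectors match those that Equations 5, 6 and 10 would give.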

In some implementations, to support the affine merge mode, both PU sizes and control point vectors of neighboring PUs may be stored in a buffer or other memory structure. As a picture is divided into CTUs and coded CTU by CTU in raster scan order, an additional line buffer, i.e. an affine motion data line buffer, may be utilized for storage of the control point vectors and PU sizes of the top neighboring blocks along the CTU boundary. In FIG. 6C, for example, the neighboring PUs of the previous CTU row containing block E, B and C use the affine mode; to support the affine merge mode of the current CTU, the affine motion data information of those neighboring PUs, which include the control point motion vectors, prediction direction (list 0 and/or list 1), reference indices of the control point vectors and the PU sizes (or sample positions of control point vectors), may be stored in the affine motion data line buffer.

Compared to motion data line buffers used for non-affine (regular) merge/skip candidate lists (for merge/skip mode) and AMVP candidate list derivation (for motion vector coding), the size of the affine motion data line buffer is significant. For example, if the minimum PU size is 4×4 and the maximum PU size is 128×128, in a non-affine motion data line buffer, a motion vector (e.g. 4 bytes) and an associated reference picture index (e.g. 4 bits) per prediction list (list 0 and list 1) are stored for every four horizontal samples. However, in some implementations of an affine motion data line buffer, two or three control point vectors (e.g. 8 or 12 bytes depending on the affine motion model used) and an associated reference picture index (e.g. 4 bits) per prediction list (list 0 and list 1), and PU width and height (e.g. 5+5 bits) are stored for every N horizontal samples (e.g. N=8, where N is the minimum PU width of PUs allowed for using affine mode). For 4K video with a horizontal picture size of 4096 luminance samples, the size of the non-affine motion data line buffer is approximately 9,216 bytes (i.e. 4096*(4+0.5)*2/4); the size of the affine motion data line buffer will be 9,344 bytes (i.e. 4096*(8+0.5)*2/8+4096*10/8/8) for the 4-parameter affine motion model and 13,440 bytes (i.e. 4096*(12+0.5)*2/8+4096*10/8/8) for the 6-parameter affine motion model, respectively.
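The buffer sizes quoted above can be reproduced with a short calculation. The function names and parameter names below are hypothetical; the default values are the example figures from the preceding paragraph (4-byte motion vectors, 4-bit reference indices, two prediction lists, storage every 4 samples for the regular buffer and every N=8 samples for the affine buffer, and 5+5 bits for PU width and height).

```python
def regular_line_buffer_bytes(width: int, mv_bytes: float = 4,
                              ref_idx_bytes: float = 0.5,
                              lists: int = 2, granularity: int = 4) -> float:
    """Size of the non-affine motion data line buffer for one picture row."""
    return width * (mv_bytes + ref_idx_bytes) * lists / granularity


def affine_line_buffer_bytes(width: int, num_control_points: int,
                             mv_bytes: float = 4, ref_idx_bytes: float = 0.5,
                             lists: int = 2, n: int = 8,
                             size_bits: int = 10) -> float:
    """Size of the affine motion data line buffer for one picture row."""
    # Control point vectors plus reference index, per list, every n samples.
    vectors = width * (num_control_points * mv_bytes + ref_idx_bytes) * lists / n
    # PU width and height (size_bits in total), every n samples.
    pu_size = width * size_bits / 8 / n
    return vectors + pu_size


print(regular_line_buffer_bytes(4096))    # 9216.0
print(affine_line_buffer_bytes(4096, 2))  # 9344.0 (4-parameter model)
print(affine_line_buffer_bytes(4096, 3))  # 13440.0 (6-parameter model)
```

Under these example figures, keeping a separate affine line buffer alongside the regular one roughly doubles the line buffer storage for the 4-parameter model and more than doubles it for the 6-parameter model.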

To reduce the memory footprint of the affine motion data line buffer, in some implementations, the non-affine or regular motion data line buffer may be re-used for the affine merge mode. FIG. 7A depicts an implementation of a 4-parameter affine motion model with a re-used motion data line buffer. In some implementations, the positions of control point motion vectors of a PU coded in affine mode or affine merge mode are unchanged, e.g., still at the top-left and top-right corner position of the PU. If a PU is coded in affine mode in which control point motion vectors are explicitly signaled, the control point motion vectors at the top-left and top-right corner position of the PU may be coded into the bitstream. If a PU is coded in affine merge mode in which control point motion vectors are inherited from neighbors, the control point motion vectors at the top-left and top-right corner position of the PU are derived by using control point vectors and positions of the selected neighboring PU.

However, if the selected neighboring PU is located at the top CTU boundary, the motion vectors stored in the regular motion data line buffer rather than the control point motion vectors of the selected PU may be used for derivation of the control point motion vectors of the current PU of the affine merge mode. For example, in FIG. 7A, if the current PU uses the affine merge mode and inherits the affine motion data from the neighboring PU E located at the top CTU boundary, then motion vectors in the regular motion data line buffer, e.g., {right arrow over (v_(LE0))}=(v_(LE0x),v_(LE0y)) at sample position (x_(LE0),y_(LE0)) and {right arrow over (v_(LE1))}=(v_(LE1x),v_(LE1y)) at the sample position (x_(LE1),y_(LE1)) with y_(LE1)=y_(LE0), instead of affine control point motion vectors of the neighboring PU E, i.e. {right arrow over (v_(E0))}=(v_(E0x),v_(E0y)) at the top-left sample position (x_(E0),y_(E0)) and {right arrow over (v_(E1))}=(v_(E1x),v_(E1y)) at the top-right sample position (x_(E1),y_(E1)), are used for the derivation of control point vectors {right arrow over (v₀)} and {right arrow over (v₁)} of the current PU.

In this case, motion vectors {right arrow over (v_(LE0))} and {right arrow over (v_(LE1))} used for motion compensation of the bottom-left and bottom-right sub-blocks of PU E are calculated by using the 4-parameter affine motion model:

$\begin{matrix}\left\{ \begin{matrix}{v_{LE0x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x_{LE0} - x_{E0}) - \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(y_{LE0} - y_{E0}) + v_{E0x}} \\ {v_{LE0y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x_{LE0} - x_{E0}) + \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(y_{LE0} - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 12} \\ \left\{ \begin{matrix}{v_{LE1x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x_{LE1} - x_{E0}) - \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(y_{LE1} - y_{E0}) + v_{E0x}} \\ {v_{LE1y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x_{LE1} - x_{E0}) + \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(y_{LE1} - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 13}\end{matrix}$

The control point vectors {right arrow over (v₀)} and {right arrow over (v₁)} of the current PU coded in affine merge mode are derived by:

$\begin{matrix}\left\{ \begin{matrix}{v_{0x} = \frac{v_{LE1x} - v_{LE0x}}{x_{LE1} - x_{LE0}}(x_{0} - x_{LE0}) - \frac{v_{LE1y} - v_{LE0y}}{x_{LE1} - x_{LE0}}(y_{0} - y_{LE0}) + v_{LE0x}} \\ {v_{0y} = \frac{v_{LE1y} - v_{LE0y}}{x_{LE1} - x_{LE0}}(x_{0} - x_{LE0}) + \frac{v_{LE1x} - v_{LE0x}}{x_{LE1} - x_{LE0}}(y_{0} - y_{LE0}) + v_{LE0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 14} \\ \left\{ \begin{matrix}{v_{1x} = \frac{v_{LE1x} - v_{LE0x}}{x_{LE1} - x_{LE0}}(x_{1} - x_{LE0}) - \frac{v_{LE1y} - v_{LE0y}}{x_{LE1} - x_{LE0}}(y_{1} - y_{LE0}) + v_{LE0x}} \\ {v_{1y} = \frac{v_{LE1y} - v_{LE0y}}{x_{LE1} - x_{LE0}}(x_{1} - x_{LE0}) + \frac{v_{LE1x} - v_{LE0x}}{x_{LE1} - x_{LE0}}(y_{1} - y_{LE0}) + v_{LE0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 15}\end{matrix}$
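As a rough illustration of Equations 12 through 15, the following Python sketch evaluates the 4-parameter model twice: once to obtain the two line-buffer vectors of PU E, and once to derive the control point vectors of the current PU from them. The function name `affine4`, the sample positions and the vector values are hypothetical, and exact rational arithmetic is assumed in place of the fixed-point arithmetic a codec would typically use.

```python
from fractions import Fraction


def affine4(p_a, v_a, p_b, v_b, p):
    """Evaluate a 4-parameter affine model at position p.

    The model is defined by vector v_a at p_a and vector v_b at p_b,
    where both positions lie on the same row (p_a[1] == p_b[1]).
    """
    dx = Fraction(p_b[0] - p_a[0])
    a = (v_b[0] - v_a[0]) / dx  # horizontal gradient of the x component
    b = (v_b[1] - v_a[1]) / dx  # horizontal gradient of the y component
    x, y = p[0] - p_a[0], p[1] - p_a[1]
    return (a * x - b * y + v_a[0], b * x + a * y + v_a[1])


# Hypothetical neighboring PU E (16 samples wide) above the current PU.
p_e0, v_e0 = (16, 0), (4, -2)   # top-left control point
p_e1, v_e1 = (32, 0), (8, 2)    # top-right control point

# Step 1 (Equations 12 and 13): bottom-row vectors kept in the line buffer.
p_le0, p_le1 = (18, 14), (30, 14)
v_le0 = affine4(p_e0, v_e0, p_e1, v_e1, p_le0)
v_le1 = affine4(p_e0, v_e0, p_e1, v_e1, p_le1)

# Step 2 (Equations 14 and 15): control points of the current PU.
p_0, p_1 = (24, 16), (40, 16)
v_0 = affine4(p_le0, v_le0, p_le1, v_le1, p_0)
v_1 = affine4(p_le0, v_le0, p_le1, v_le1, p_1)

# With exact arithmetic, this equals inheriting directly from E's control points.
assert v_0 == affine4(p_e0, v_e0, p_e1, v_e1, p_0)
assert v_1 == affine4(p_e0, v_e0, p_e1, v_e1, p_1)
```

In this sketch the two routes agree exactly because two vectors on the same row fully determine a 4-parameter model; with fixed-point rounding the results may differ slightly.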

If the selected neighboring PU is not located at the top CTU boundary, e.g. located to the left side of the current PU or located inside the current CTU, then the control point vectors {right arrow over (v₀)} and {right arrow over (v₁)} of the current PU are derived by directly using the control point vectors of the selected neighboring PU.

For example, if PU D in FIG. 7A is the selected neighboring PU for the current PU coded in affine merge mode, then the control point vectors {right arrow over (v₀)} and {right arrow over (v₁)} of the current PU are derived by directly using the control point vectors of the neighboring PU D, i.e. {right arrow over (v_(D0))}=(v_(D0x),v_(D0y)) at the top-left sample position (x_(D0),y_(D0)) and {right arrow over (v_(D1))}=(v_(D1x),v_(D1y)) at the top-right sample position (x_(D1),y_(D1)), by:

$\begin{matrix}\left\{ \begin{matrix}{v_{0x} = \frac{v_{D1x} - v_{D0x}}{x_{D1} - x_{D0}}(x_{0} - x_{D0}) - \frac{v_{D1y} - v_{D0y}}{x_{D1} - x_{D0}}(y_{0} - y_{D0}) + v_{D0x}} \\ {v_{0y} = \frac{v_{D1y} - v_{D0y}}{x_{D1} - x_{D0}}(x_{0} - x_{D0}) + \frac{v_{D1x} - v_{D0x}}{x_{D1} - x_{D0}}(y_{0} - y_{D0}) + v_{D0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 16} \\ \left\{ \begin{matrix}{v_{1x} = \frac{v_{D1x} - v_{D0x}}{x_{D1} - x_{D0}}(x_{1} - x_{D0}) - \frac{v_{D1y} - v_{D0y}}{x_{D1} - x_{D0}}(y_{1} - y_{D0}) + v_{D0x}} \\ {v_{1y} = \frac{v_{D1y} - v_{D0y}}{x_{D1} - x_{D0}}(x_{1} - x_{D0}) + \frac{v_{D1x} - v_{D0x}}{x_{D1} - x_{D0}}(y_{1} - y_{D0}) + v_{D0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 17}\end{matrix}$

Implementations of this method effectively reduce the memory footprint of the affine motion data line buffer for the case of 4-parameter affine motion models. In such implementations, the control point motion vectors and associated reference picture indices are replaced by the regular motion data that is already stored in the regular motion data line buffer, and only the PU horizontal size may be additionally stored for the affine merge mode. For 4K video with a picture width of 4096 luminance samples and assuming the minimum PU width using affine mode is 8, the size of the affine motion data line buffer can be reduced from 9,344 bytes (i.e. 4096*(8+0.5)*2/8+4096*10/8/8) to 320 bytes (i.e. 4096*5/8/8).

A similar approach can be applied to the 6-parameter affine motion model. As shown in FIG. 7B, if the current PU selects a neighboring PU located at the top CTU boundary, e.g. PU E, as the source PU to inherit the affine motion data, then the derivation of control point vectors {right arrow over (v₀)}, {right arrow over (v₁)} and {right arrow over (v₂)} of the current PU can be implemented in two steps. In the first step, the sub-block motion vectors used for motion compensation of the bottom-left and bottom-right sub-block of PU E, i.e. {right arrow over (v_(LE0))}=(v_(LE0x),v_(LE0y)) at sample position (x_(LE0),y_(LE0)) and {right arrow over (v_(LE1))}=(v_(LE1x),v_(LE1y)) at the sample position (x_(LE1),y_(LE1)) with x_(LE0)=x_(E0) and y_(LE1)=y_(LE0), are computed by using the 6-parameter affine motion model:

$\begin{matrix}\left\{ \begin{matrix}{v_{LE0x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x_{LE0} - x_{E0}) + \frac{v_{E2x} - v_{E0x}}{y_{E2} - y_{E0}}(y_{LE0} - y_{E0}) + v_{E0x}} \\ {v_{LE0y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x_{LE0} - x_{E0}) + \frac{v_{E2y} - v_{E0y}}{y_{E2} - y_{E0}}(y_{LE0} - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 18} \\ \left\{ \begin{matrix}{v_{LE1x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x_{LE1} - x_{E0}) + \frac{v_{E2x} - v_{E0x}}{y_{E2} - y_{E0}}(y_{LE1} - y_{E0}) + v_{E0x}} \\ {v_{LE1y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x_{LE1} - x_{E0}) + \frac{v_{E2y} - v_{E0y}}{y_{E2} - y_{E0}}(y_{LE1} - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 19}\end{matrix}$

In the second step, the control point vectors {right arrow over (v₀)}, {right arrow over (v₁)} and {right arrow over (v₂)} of the current PU coded in affine merge mode are derived by using the 6-parameter affine motion model:

$\begin{matrix}\left\{ \begin{matrix}{v_{0x} = \frac{v_{LE1x} - v_{LE0x}}{x_{LE1} - x_{LE0}}(x_{0} - x_{LE0}) + \frac{v_{E0x} - v_{LE0x}}{y_{E0} - y_{LE0}}(y_{0} - y_{LE0}) + v_{LE0x}} \\ {v_{0y} = \frac{v_{LE1y} - v_{LE0y}}{x_{LE1} - x_{LE0}}(x_{0} - x_{LE0}) + \frac{v_{E0y} - v_{LE0y}}{y_{E0} - y_{LE0}}(y_{0} - y_{LE0}) + v_{LE0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 20} \\ \left\{ \begin{matrix}{v_{1x} = \frac{v_{LE1x} - v_{LE0x}}{x_{LE1} - x_{LE0}}(x_{1} - x_{LE0}) + \frac{v_{E0x} - v_{LE0x}}{y_{E0} - y_{LE0}}(y_{1} - y_{LE0}) + v_{LE0x}} \\ {v_{1y} = \frac{v_{LE1y} - v_{LE0y}}{x_{LE1} - x_{LE0}}(x_{1} - x_{LE0}) + \frac{v_{E0y} - v_{LE0y}}{y_{E0} - y_{LE0}}(y_{1} - y_{LE0}) + v_{LE0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 21} \\ \left\{ \begin{matrix}{v_{2x} = \frac{v_{LE1x} - v_{LE0x}}{x_{LE1} - x_{LE0}}(x_{2} - x_{LE0}) + \frac{v_{E0x} - v_{LE0x}}{y_{E0} - y_{LE0}}(y_{2} - y_{LE0}) + v_{LE0x}} \\ {v_{2y} = \frac{v_{LE1y} - v_{LE0y}}{x_{LE1} - x_{LE0}}(x_{2} - x_{LE0}) + \frac{v_{E0y} - v_{LE0y}}{y_{E0} - y_{LE0}}(y_{2} - y_{LE0}) + v_{LE0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 22}\end{matrix}$
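The two steps of Equations 18 through 22 can be sketched in Python as follows. The names `affine6` and `inherit6_from_line_buffer`, the positions and the vector values are hypothetical, and exact rational arithmetic is assumed. The sketch takes the horizontal gradient from the two line-buffer vectors and the vertical gradient from the separately stored top-left control point vector of PU E, which requires x_(LE0)=x_(E0).

```python
from fractions import Fraction


def affine6(p_0, v_0, p_1, v_1, p_2, v_2, p):
    """6-parameter model with v_0 at p_0, v_1 at p_1 (same row as p_0)
    and v_2 at p_2 (same column as p_0), evaluated at position p."""
    w = Fraction(p_1[0] - p_0[0])
    h = Fraction(p_2[1] - p_0[1])
    x, y = p[0] - p_0[0], p[1] - p_0[1]
    return tuple((v_1[i] - v_0[i]) / w * x + (v_2[i] - v_0[i]) / h * y + v_0[i]
                 for i in range(2))


def inherit6_from_line_buffer(p_le0, v_le0, p_le1, v_le1, p_e0, v_e0, p):
    """Second step (Equations 20 to 22), evaluated at position p."""
    if p_le0[0] != p_e0[0] or p_le0[1] != p_le1[1]:
        raise ValueError("expects x_LE0 == x_E0 and y_LE1 == y_LE0")
    w = Fraction(p_le1[0] - p_le0[0])
    h = Fraction(p_e0[1] - p_le0[1])
    x, y = p[0] - p_le0[0], p[1] - p_le0[1]
    return tuple((v_le1[i] - v_le0[i]) / w * x
                 + (v_e0[i] - v_le0[i]) / h * y + v_le0[i]
                 for i in range(2))


# Hypothetical 16x16 neighboring PU E with three control points.
p_e0, v_e0 = (16, 0), (4, -2)
p_e1, v_e1 = (32, 0), (8, 2)
p_e2, v_e2 = (16, 16), (1, 3)

# First step (Equations 18 and 19): bottom-row vectors in the line buffer.
p_le0, p_le1 = (16, 14), (30, 14)
v_le0 = affine6(p_e0, v_e0, p_e1, v_e1, p_e2, v_e2, p_le0)
v_le1 = affine6(p_e0, v_e0, p_e1, v_e1, p_e2, v_e2, p_le1)

# Second step: top-left, top-right and bottom-left corners of the current PU.
corners = [(24, 16), (40, 16), (24, 32)]
derived = [inherit6_from_line_buffer(p_le0, v_le0, p_le1, v_le1, p_e0, v_e0, p)
           for p in corners]
direct = [affine6(p_e0, v_e0, p_e1, v_e1, p_e2, v_e2, p) for p in corners]
assert derived == direct
```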

There are multiple ways of selecting sample positions (x_(LE0),y_(LE0)) and (x_(LE1),y_(LE1)) for the selected neighboring PU (e.g. PU E). In the example depicted in FIG. 7B, x_(LE0)=x_(E0) and y_(LE1)=y_(LE0), satisfying the conditions of the 6-parameter affine model defined by Equation 20, Equation 21, and Equation 22. In another implementation, x_(LE0)=x_(E2), y_(LE0)=y_(E2) and y_(LE1)=y_(LE0), such that the control point vector of the bottom-left corner of PU E is directly used for motion compensation of the sub-block and stored in the regular motion data line buffer.

If the selected neighboring PU is not located at the top CTU boundary, for example, if PU D in FIG. 7B is the selected neighboring PU for the current PU coded in affine merge mode, then the {right arrow over (v₀)}, {right arrow over (v₁)} and {right arrow over (v₂)} of the current PU may be derived directly using the neighboring control point vectors, e.g., using {right arrow over (v_(D0))}=(v_(D0x),v_(D0y)) at the top-left sample position (x_(D0),y_(D0)), {right arrow over (v_(D1))}=(v_(D1x),v_(D1y)) at the top-right sample position (x_(D1),y_(D1)) and {right arrow over (v_(D2))}=(v_(D2x),v_(D2y)) at the bottom-left sample position (x_(D2),y_(D2)) of the neighboring PU D, by:

$\begin{matrix}\left\{ \begin{matrix}{v_{0x} = \frac{v_{D1x} - v_{D0x}}{x_{D1} - x_{D0}}(x_{0} - x_{D0}) + \frac{v_{D2x} - v_{D0x}}{y_{D2} - y_{D0}}(y_{0} - y_{D0}) + v_{D0x}} \\ {v_{0y} = \frac{v_{D1y} - v_{D0y}}{x_{D1} - x_{D0}}(x_{0} - x_{D0}) + \frac{v_{D2y} - v_{D0y}}{y_{D2} - y_{D0}}(y_{0} - y_{D0}) + v_{D0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 23} \\ \left\{ \begin{matrix}{v_{1x} = \frac{v_{D1x} - v_{D0x}}{x_{D1} - x_{D0}}(x_{1} - x_{D0}) + \frac{v_{D2x} - v_{D0x}}{y_{D2} - y_{D0}}(y_{1} - y_{D0}) + v_{D0x}} \\ {v_{1y} = \frac{v_{D1y} - v_{D0y}}{x_{D1} - x_{D0}}(x_{1} - x_{D0}) + \frac{v_{D2y} - v_{D0y}}{y_{D2} - y_{D0}}(y_{1} - y_{D0}) + v_{D0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 24} \\ \left\{ \begin{matrix}{v_{2x} = \frac{v_{D1x} - v_{D0x}}{x_{D1} - x_{D0}}(x_{2} - x_{D0}) + \frac{v_{D2x} - v_{D0x}}{y_{D2} - y_{D0}}(y_{2} - y_{D0}) + v_{D0x}} \\ {v_{2y} = \frac{v_{D1y} - v_{D0y}}{x_{D1} - x_{D0}}(x_{2} - x_{D0}) + \frac{v_{D2y} - v_{D0y}}{y_{D2} - y_{D0}}(y_{2} - y_{D0}) + v_{D0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 25}\end{matrix}$

In some implementations using the 6-parameter affine motion model, only two control point vectors can be replaced by the motion data stored in the regular motion data line buffer; the third control point vector required by the 6-parameter model, e.g., either the top-left or top-right control point vector of a PU, may be stored in the affine motion data line buffer. In such implementations, both the PU width and height may also be stored. Nonetheless, this still results in significant memory savings. For 4K video with a picture width of 4096 luminance samples and assuming the minimum PU width using affine mode is 8, the size of the affine motion data line buffer is reduced from 13,440 bytes (i.e. 4096*(12+0.5)*2/8+4096*10/8/8) to 4,736 bytes (i.e. 4096*4*2/8+4096*10/8/8).

Although discussed primarily as serial operations, in some implementations, for the affine merge mode, instead of a sequential process of deriving the control point vectors from the neighboring affine motion data for the current PU, followed by deriving sub-block motion data of the current PU by using the derived control point vectors, a parallel process can be used in which both the derivation of control point vectors and the derivation of sub-block motion data for the current PU directly use the neighboring affine motion data. For example, for a 4-parameter model as shown in FIG. 8A, if the current PU of the affine merge mode inherits affine motion data from a neighboring PU E, the same control point vectors of neighboring PU E (e.g., {right arrow over (v_(E0))}=(v_(E0x),v_(E0y)) and {right arrow over (v_(E1))}=(v_(E1x),v_(E1y))) may be used for derivation of both the control point vectors of the current PU (e.g., {right arrow over (v₀)}=(v_(0x),v_(0y)) at the top-left corner position (x₀,y₀) and {right arrow over (v₁)}=(v_(1x),v_(1y)) at the top-right corner position (x₁,y₁)) and the sub-block motion vector {right arrow over (v)}=(v_(x),v_(y)) at a sub-block location (x,y) inside the current PU, by:

$\begin{matrix}\left\{ \begin{matrix}{v_{0x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x_{0} - x_{E0}) - \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(y_{0} - y_{E0}) + v_{E0x}} \\ {v_{0y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x_{0} - x_{E0}) + \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(y_{0} - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 26} \\ \left\{ \begin{matrix}{v_{1x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x_{1} - x_{E0}) - \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(y_{1} - y_{E0}) + v_{E0x}} \\ {v_{1y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x_{1} - x_{E0}) + \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(y_{1} - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 27} \\ \left\{ \begin{matrix}{v_{x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x - x_{E0}) - \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(y - y_{E0}) + v_{E0x}} \\ {v_{y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x - x_{E0}) + \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(y - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 28}\end{matrix}$

In some implementations, the derivation of control point vectors and the derivation of sub-block vectors are separated into two steps. In the first step, the control point vectors of the current PU, e.g., {right arrow over (v₀)}=(v_(0x),v_(0y)) at the top-left corner position (x₀,y₀) and {right arrow over (v₁)}=(v_(1x),v_(1y)) at the top-right corner position (x₁,y₁), are derived by using the following Equations:

$\begin{matrix}\left\{ \begin{matrix}{v_{0x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x_{0} - x_{E0}) - \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(y_{0} - y_{E0}) + v_{E0x}} \\ {v_{0y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x_{0} - x_{E0}) + \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(y_{0} - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 29} \\ \left\{ \begin{matrix}{v_{1x} = \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(x_{1} - x_{E0}) - \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(y_{1} - y_{E0}) + v_{E0x}} \\ {v_{1y} = \frac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}(x_{1} - x_{E0}) + \frac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}(y_{1} - y_{E0}) + v_{E0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 30}\end{matrix}$

In the second step, the sub-block motion vector {right arrow over (v)}=(v_(x),v_(y)) at a sub-block location (x,y) inside the current PU is computed from the derived control point vectors {right arrow over (v₀)} and {right arrow over (v₁)}, by:

$\begin{matrix}\left\{ \begin{matrix}{v_{x} = \frac{v_{1x} - v_{0x}}{x_{1} - x_{0}}(x - x_{0}) - \frac{v_{1y} - v_{0y}}{x_{1} - x_{0}}(y - y_{0}) + v_{0x}} \\ {v_{y} = \frac{v_{1y} - v_{0y}}{x_{1} - x_{0}}(x - x_{0}) + \frac{v_{1x} - v_{0x}}{x_{1} - x_{0}}(y - y_{0}) + v_{0y}}\end{matrix} \right. & {{Equation}\mspace{14mu} 31}\end{matrix}$

A similar parallel process for deriving control point vectors and sub-block motion data for the current PU coded in affine merge mode can also be implemented for other types of affine motion model (e.g. the 6-parameter model).
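The difference between the sequential and parallel derivations for the 4-parameter model can be illustrated with a small Python sketch. The helper names, the choice of sub-block centers as sub-block locations, the 16×16 PU size and all numeric values are assumptions made only for this example, and exact rational arithmetic is used.

```python
from fractions import Fraction


def affine4(p_a, v_a, p_b, v_b, p):
    """4-parameter affine model through v_a at p_a and v_b at p_b (same row)."""
    dx = Fraction(p_b[0] - p_a[0])
    a = (v_b[0] - v_a[0]) / dx
    b = (v_b[1] - v_a[1]) / dx
    x, y = p[0] - p_a[0], p[1] - p_a[1]
    return (a * x - b * y + v_a[0], b * x + a * y + v_a[1])


def subblock_centers(p_0, width, height, sb=4):
    """Yield one sample location per sb x sb sub-block of a PU at p_0."""
    for y in range(p_0[1] + sb // 2, p_0[1] + height, sb):
        for x in range(p_0[0] + sb // 2, p_0[0] + width, sb):
            yield (x, y)


# Hypothetical neighboring PU E and current PU corner positions.
p_e0, v_e0 = (16, 0), (4, -2)
p_e1, v_e1 = (32, 0), (8, 2)
p_0, p_1 = (24, 16), (40, 16)
locations = list(subblock_centers(p_0, 16, 16))

# Sequential: control points first, then sub-block vectors from them.
v_0 = affine4(p_e0, v_e0, p_e1, v_e1, p_0)
v_1 = affine4(p_e0, v_e0, p_e1, v_e1, p_1)
sequential = {p: affine4(p_0, v_0, p_1, v_1, p) for p in locations}

# Parallel: sub-block vectors directly from the neighbor's control points,
# with no dependency on v_0 and v_1 having been computed.
parallel = {p: affine4(p_e0, v_e0, p_e1, v_e1, p) for p in locations}

assert sequential == parallel
```

Because the sub-block vectors in the parallel form do not wait for the control point derivation, the two derivations can in principle be scheduled concurrently.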

Although the proposed method is mainly described for the 4-parameter and 6-parameter affine motion models, the same approach can be applied to other affine motion models, such as 3-parameter affine motion models used for zooming or rotation only.

FIG. 8B is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip mode, for a 6-parameter affine motion model, modified from the implementation of FIG. 7B. As shown, vectors {right arrow over (v_(LE0))}, {right arrow over (v_(LB0))}, and {right arrow over (v_(LC0))} are shifted relative to their positions in the implementation of FIG. 7B. Specifically, in the case of the 6-parameter affine model, a significant amount of line buffer storage is still utilized because of required storage of either the top-left or top-right control point vector in addition to sharing the motion data line buffer. To further reduce the line buffer footprint in the 6-parameter model case, the following approach can be applied.

As shown in FIG. 8B, if the current PU selects a neighboring PU located at the top CTU boundary, e.g. PU E, as the source PU to inherit the affine motion data, then the derivation of control point vectors {right arrow over (v₀)}, {right arrow over (v₁)} and {right arrow over (v₂)} of the current PU can be implemented in two steps.

In the first step, the sub-block motion vectors used for motion compensation of the bottom-left and bottom-right sub-block of PU E, i.e. {right arrow over (v_(LE0))}=(v_(LE0x),v_(LE0y)) at sample position (x_(LE0),y_(LE0)) and {right arrow over (v_(LE1))}=(v_(LE1x),v_(LE1y)) at the sample position (x_(LE1),y_(LE1)) with y_(LE1)=y_(LE0), are computed by using the 6-parameter affine motion model:

$\begin{matrix}\{ \begin{matrix}{v_{{LE}\; 0x} = {{\frac{( {v_{E\; 1x} - v_{E\; 0x}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LE}\; 0} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2x} - v_{E\; 0x}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LE}\; 0} - y_{E\; 0}} )} + v_{E\; 0x}}} \\{v_{{LE}\; 0y} = {{\frac{( {v_{E\; 1y} - v_{E\; 0y}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LE}\; 0} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2y} - v_{E\; 0y}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LE}\; 0} - y_{E\; 0}} )} + v_{E\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 32} \\\{ \begin{matrix}{v_{{LE}\; 1x} = {{\frac{( {v_{E\; 1x} - v_{E\; 0x}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LE}\; 1} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2x} - v_{E\; 0x}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LE}\; 1} - y_{E\; 0}} )} + v_{E\; 0x}}} \\{v_{{LE}\; 1y} = {{\frac{( {v_{E\; 1y} - v_{E\; 0y}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LE}\; 1} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2y} - v_{E\; 0y}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LE}\; 1} - y_{E\; 0}} )} + v_{E\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 33}\end{matrix}$

In the second step, in some implementations, the control point vectors {right arrow over (v₀)}, {right arrow over (v₁)} and {right arrow over (v₂)} of the current PU coded in affine merge mode are derived by using the 4-parameter affine motion model (instead of the 6-parameter model), by:

$\begin{matrix}\{ \begin{matrix}{v_{0x} = {{\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{0} - x_{{LE}\; 0}} )} - {\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {y_{0} - y_{{LE}\; 0}} )} + v_{{LE}\; 0x}}} \\{v_{0y} = {{\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{0} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {y_{0} - y_{{LE}\; 0}} )} + v_{{LE}\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 34} \\\{ \begin{matrix}{v_{1x} = {{\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{1} - x_{{LE}\; 0}} )} - {\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {y_{1} - y_{{LE}\; 0}} )} + v_{{LE}\; 0x}}} \\{v_{1y} = {{\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{1} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {y_{1} - y_{{LE}\; 0}} )} + v_{{LE}\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 35} \\\{ \begin{matrix}{v_{2x} = {{\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{2} - x_{{LE}\; 0}} )} - {\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {y_{2} - y_{{LE}\; 0}} )} + v_{{LE}\; 0x}}} \\{v_{2y} = {{\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{2} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {y_{2} - y_{{LE}\; 0}} )} + v_{{LE}\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 36}\end{matrix}$
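The effect of deriving the control point vectors with the 4-parameter model from a 6-parameter neighbor (Equations 32 through 36) can be sketched as below. All names and numeric values are hypothetical and exact rational arithmetic is assumed. In this sketch, the result coincides with full 6-parameter inheritance when the neighbor's motion happens to be a pure rotation/zoom, and otherwise acts as an approximation.

```python
from fractions import Fraction


def affine4(p_a, v_a, p_b, v_b, p):
    """4-parameter affine model through v_a at p_a and v_b at p_b (same row)."""
    dx = Fraction(p_b[0] - p_a[0])
    a = (v_b[0] - v_a[0]) / dx
    b = (v_b[1] - v_a[1]) / dx
    x, y = p[0] - p_a[0], p[1] - p_a[1]
    return (a * x - b * y + v_a[0], b * x + a * y + v_a[1])


def affine6(p_0, v_0, p_1, v_1, p_2, v_2, p):
    """6-parameter affine model through three control points."""
    w = Fraction(p_1[0] - p_0[0])
    h = Fraction(p_2[1] - p_0[1])
    x, y = p[0] - p_0[0], p[1] - p_0[1]
    return tuple((v_1[i] - v_0[i]) / w * x + (v_2[i] - v_0[i]) / h * y + v_0[i]
                 for i in range(2))


def inherit_with_4param(neighbor, p_le0, p_le1, targets):
    """Step 1: line-buffer vectors from the 6-parameter neighbor.
    Step 2: target vectors from those two vectors via the 4-parameter model."""
    v_le0 = affine6(*neighbor, p_le0)
    v_le1 = affine6(*neighbor, p_le1)
    return [affine4(p_le0, v_le0, p_le1, v_le1, p) for p in targets]


p_le0, p_le1 = (18, 14), (30, 14)
targets = [(24, 16), (40, 16), (24, 32)]  # corners of the current PU

# Neighbor whose 6-parameter model is a pure rotation/zoom.
similarity = ((16, 0), (4, -2), (32, 0), (8, 2), (16, 16), (0, 2))
# Neighbor with a general 6-parameter model.
general = ((16, 0), (4, -2), (32, 0), (8, 2), (16, 16), (1, 3))

exact_similarity = [affine6(*similarity, p) for p in targets]
exact_general = [affine6(*general, p) for p in targets]

assert inherit_with_4param(similarity, p_le0, p_le1, targets) == exact_similarity
assert inherit_with_4param(general, p_le0, p_le1, targets) != exact_general
```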

If the selected neighboring PU is not located at the top CTU boundary, for example, if PU D in FIG. 8B is the selected neighboring PU for the current PU coded in affine merge mode, then the {right arrow over (v₀)}, {right arrow over (v₁)} and {right arrow over (v₂)} of the current PU are derived by directly using the neighboring control point vectors, i.e. using {right arrow over (v_(D0))}=(v_(D0x),v_(D0y)) at the top-left sample position (x_(D0),y_(D0)), {right arrow over (v_(D1))}=(v_(D1x),v_(D1y)) at the top-right sample position (x_(D1),y_(D1)) and {right arrow over (v_(D2))}=(v_(D2x),v_(D2y)) at the bottom-left sample position (x_(D2),y_(D2)) of the neighboring PU D, by:

$\begin{matrix}\{ \begin{matrix}{v_{0x} = {{\frac{( {v_{D\; 1x} - v_{D\; 0x}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {x_{0} - x_{D\; 0}} )} - {\frac{( {v_{D\; 1y} - v_{D\; 0y}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {y_{0} - y_{D\; 0}} )} + v_{D\; 0x}}} \\{v_{0y} = {{\frac{( {v_{D\; 1y} - v_{D\; 0y}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {x_{0} - x_{D\; 0}} )} + {\frac{( {v_{D\; 1x} - v_{D\; 0x}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {y_{0} - y_{D\; 0}} )} + v_{D\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 37} \\\{ \begin{matrix}{v_{1x} = {{\frac{( {v_{D\; 1x} - v_{D\; 0x}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {x_{1} - x_{D\; 0}} )} - {\frac{( {v_{D\; 1y} - v_{D\; 0y}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {y_{1} - y_{D\; 0}} )} + v_{D\; 0x}}} \\{v_{1y} = {{\frac{( {v_{D\; 1y} - v_{D\; 0y}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {x_{1} - x_{D\; 0}} )} + {\frac{( {v_{D\; 1x} - v_{D\; 0x}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {y_{1} - y_{D\; 0}} )} + v_{D\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 38} \\\{ \begin{matrix}{v_{2x} = {{\frac{( {v_{D\; 1x} - v_{D\; 0x}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {x_{2} - x_{D\; 0}} )} - {\frac{( {v_{D\; 1y} - v_{D\; 0y}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {y_{2} - y_{D\; 0}} )} + v_{D\; 0x}}} \\{v_{2y} = {{\frac{( {v_{D\; 1y} - v_{D\; 0y}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {x_{2} - x_{D\; 0}} )} + {\frac{( {v_{D\; 1x} - v_{D\; 0x}} )}{( {x_{D\; 1} - x_{D\; 0}} )}( {y_{2} - y_{D\; 0}} )} + v_{D\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 39}\end{matrix}$

This simplified method also works for affine merge mode with adaptive selection of affine motion model at the PU level (e.g. adaptive 4-parameter and 6-parameter model at PU level). As long as the 4-parameter model (as used above in Equations 34, 35 and 36) is used to derive the control point vectors for the current PU for the case in which the selected neighboring PU uses the 6-parameter model and is located at the top CTU boundary, the additional storage of the top-left or top-right control point vectors and PU height of the selected PU can be avoided.

With this simplified method, the line buffer size can be even further reduced. For 4K video with a picture width of 4096 luminance samples and assuming the minimum PU width using affine mode is 8, the size of the affine motion data line buffer is reduced from 13,440 bytes (i.e. 4096*(12+0.5)*2/8+4096*10/8/8) to 320 bytes (i.e. 4096*5/8/8).
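The reduced line buffer sizes quoted in this section follow from the same arithmetic, as the following Python sketch shows. The function and parameter names are illustrative assumptions; `stored_cp_bytes` is the size of any control point vector still kept per prediction list, and `size_bits` is 5 when only the PU width is stored or 10 when both width and height are stored.

```python
def shared_line_buffer_extra_bytes(picture_width, stored_cp_bytes, size_bits,
                                   num_lists=2, min_pu_width=8):
    """Extra affine storage needed when the regular motion data line buffer
    is re-used for the affine merge mode."""
    control_points = picture_width * stored_cp_bytes * num_lists / min_pu_width
    pu_size = picture_width * size_bits / 8 / min_pu_width  # bits to bytes
    return control_points + pu_size


width = 4096
# 4-parameter model: only the PU width is stored.
four_param = shared_line_buffer_extra_bytes(width, stored_cp_bytes=0, size_bits=5)
# 6-parameter model with one extra control point vector and PU width/height.
six_param_one_cp = shared_line_buffer_extra_bytes(width, stored_cp_bytes=4, size_bits=10)
# 6-parameter neighbor handled with the simplified 4-parameter derivation.
six_param_simplified = shared_line_buffer_extra_bytes(width, stored_cp_bytes=0, size_bits=5)

print(four_param)            # 320.0
print(six_param_one_cp)      # 4736.0
print(six_param_simplified)  # 320.0
```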

In implementations using a 6-parameter affine model, if the neighboring PU width is large enough, the 6-parameter affine model may be used for the derivation of control point vectors of the current PU. Depending on the neighboring PU width, an adaptive 4- and 6-parameter affine motion model may be used to derive the control point vectors of the current PU.

FIG. 9A is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip mode, for an adaptive affine motion model, according to some implementations. As shown in FIG. 9A, if the current PU selects a neighboring PU located at the top CTU boundary, e.g. PU E, as the source PU to inherit the affine motion data, then the derivation of control point vectors {right arrow over (v₀)}, {right arrow over (v₁)} and {right arrow over (v₂)} of the current PU may be implemented in two steps, according to the following implementation.

In the first step, the sub-block motion vectors used for motion compensation of the bottom-left and bottom-right sub-block of PU E, i.e. {right arrow over (v_(LE0))}=(v_(LE0x),v_(LE0y)) at sample position (x_(LE0),y_(LE0)) and {right arrow over (v_(LE1))}=(v_(LE1x),v_(LE1y)) at the sample position (x_(LE1),y_(LE1)) with y_(LE1)=y_(LE0), are computed by using the 6-parameter affine motion model:

$\begin{matrix}\{ \begin{matrix}{v_{{LE}\; 0x} = {{\frac{( {v_{E\; 1x} - v_{E\; 0x}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LE}\; 0} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2x} - v_{E\; 0x}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LE}\; 0} - y_{E\; 0}} )} + v_{E\; 0x}}} \\{v_{{LE}\; 0y} = {{\frac{( {v_{E\; 1y} - v_{E\; 0y}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LE}\; 0} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2y} - v_{E\; 0y}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LE}\; 0} - y_{E\; 0}} )} + v_{E\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 40} \\\{ \begin{matrix}{v_{{LE}\; 1x} = {{\frac{( {v_{E\; 1x} - v_{E\; 0x}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LE}\; 1} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2x} - v_{E\; 0x}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LE}\; 1} - y_{E\; 0}} )} + v_{E\; 0x}}} \\{v_{{LE}\; 1y} = {{\frac{( {v_{E\; 1y} - v_{E\; 0y}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LE}\; 1} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2y} - v_{E\; 0y}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LE}\; 1} - y_{E\; 0}} )} + v_{E\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 41}\end{matrix}$

Furthermore, if the neighboring PU E is wide enough, then additional sub-block vectors may be already stored in the regular motion data line buffer. For example, if the PU E has a width larger than or equal to 16 samples, and the sub-block width is 4 samples, then at least 4 bottom sub-block vectors of PU E are stored in the regular motion data line buffer. As shown in FIG. 9A, in this case the sub-block vectors at bottom center positions of the neighboring PU E ({right arrow over (v_(LEc0))}=(v_(LEc0x),v_(LEc0y)) at sample position (x_(LEc0),y_(LEc0)) and {right arrow over (v_(LEc1))}=(v_(LEc1x),v_(LEc1y)) at the sample position (x_(LEc1),y_(LEc1)) with y_(LEc1)=y_(LEc0)) are computed by using the 6-parameter affine motion model:

$\begin{matrix}\{ \begin{matrix}{v_{{LEc}\; 0x} = {{\frac{( {v_{E\; 1x} - v_{E\; 0x}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LEc}\; 0} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2x} - v_{E\; 0x}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LEc}\; 0} - y_{E\; 0}} )} + v_{E\; 0x}}} \\{v_{{LEc}\; 0y} = {{\frac{( {v_{E\; 1y} - v_{E\; 0y}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LEc}\; 0} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2y} - v_{E\; 0y}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LEc}\; 0} - y_{E\; 0}} )} + v_{E\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 42} \\\{ \begin{matrix}{v_{{LEc}\; 1x} = {{\frac{( {v_{E\; 1x} - v_{E\; 0x}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LEc}\; 1} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2x} - v_{E\; 0x}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LEc}\; 1} - y_{E\; 0}} )} + v_{E\; 0x}}} \\{v_{{LEc}\; 1y} = {{\frac{( {v_{E\; 1y} - v_{E\; 0y}} )}{( {x_{E\; 1} - x_{E\; 0}} )}( {x_{{LEc}\; 1} - x_{E\; 0}} )} + {\frac{( {v_{E\; 2y} - v_{E\; 0y}} )}{( {y_{E\; 2} - y_{E\; 0}} )}( {y_{{LEc}\; 1} - y_{E\; 0}} )} + v_{E\; 0y}}}\end{matrix}  & {{Equation}\mspace{14mu} 43}\end{matrix}$

In the second step, the control point vectors {right arrow over (v₀)}, {right arrow over (v₁)} and {right arrow over (v₂)} of the current PU coded in affine merge mode are derived by using the 6-parameter affine motion model:

$\begin{matrix}\{ \begin{matrix}{v_{0x} = {{\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{0} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1x} + v_{{LE}\; 0x} - v_{{LEc}\; 1x} - v_{{LEc}\; 0x}} )}{2( {y_{{LE}\; 1} - y_{{LEc}\; 1}} )}( {y_{0} - y_{{LE}\; 0}} )} + v_{{LE}\; 0x}}} \\{v_{0y} = {{\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{0} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1y} + v_{{LE}\; 0y} - v_{{LEc}\; 1y} - v_{{LEc}\; 0y}} )}{2( {y_{{LE}\; 1} - y_{{LEc}\; 1}} )}( {y_{0} - y_{{LE}\; 0}} )} + v_{{LE}\; 0y}}}\end{matrix}  & {{Eq}.\mspace{14mu} 44} \\\{ \begin{matrix}{v_{1x} = {{\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{1} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1x} + v_{{LE}\; 0x} - v_{{LEc}\; 1x} - v_{{LEc}\; 0x}} )}{2( {y_{{LE}\; 1} - y_{{LEc}\; 1}} )}( {y_{1} - y_{{LE}\; 0}} )} + v_{{LE}\; 0x}}} \\{v_{1y} = {{\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{1} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1y} + v_{{LE}\; 0y} - v_{{LEc}\; 1y} - v_{{LEc}\; 0y}} )}{2( {y_{{LE}\; 1} - y_{{LEc}\; 1}} )}( {y_{1} - y_{{LE}\; 0}} )} + v_{{LE}\; 0y}}}\end{matrix}  & {{Eq}.\mspace{14mu} 45} \\\{ \begin{matrix}{v_{2x} = {{\frac{( {v_{{LE}\; 1x} - v_{{LE}\; 0x}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{2} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1x} + v_{{LE}\; 0x} - v_{{LEc}\; 1x} - v_{{LEc}\; 0x}} )}{2( {y_{{LE}\; 1} - y_{{LEc}\; 1}} )}( {y_{2} - y_{{LE}\; 0}} )} + v_{{LE}\; 0x}}} \\{v_{2y} = {{\frac{( {v_{{LE}\; 1y} - v_{{LE}\; 0y}} )}{( {x_{{LE}\; 1} - x_{{LE}\; 0}} )}( {x_{2} - x_{{LE}\; 0}} )} + {\frac{( {v_{{LE}\; 1y} + v_{{LE}\; 0y} - v_{{LEc}\; 1y} - v_{{LEc}\; 0y}} )}{2( {y_{{LE}\; 1} - y_{{LEc}\; 1}} )}( {y_{2} - y_{{LE}\; 0}} )} + v_{{LE}\; 0y}}}\end{matrix}  & {{Eq}.\mspace{14mu} 46}\end{matrix}$

Note that the selected sub-block vector sample locations must satisfy the following conditions for the 6-parameter affine motion model based inheritance to work.

$$
\begin{cases}
y_{LE1} = y_{LE0} \\
y_{LEc1} = y_{LEc0} \\
y_{LE1} \neq y_{LEc1} \\
x_{LE0} + x_{LE1} = x_{LEc0} + x_{LEc1}
\end{cases} \qquad \text{(Equation 47)}
$$
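The derivation in Equations 44 to 46, together with the conditions of Equation 47, can be illustrated with a minimal Python sketch. The function name `inherit_from_subblocks`, the tuple layout `((x, y), (vx, vy))` and the use of exact `Fraction` arithmetic are assumptions made for illustration only; the disclosure does not prescribe them, and a decoder would typically use fixed-point integer arithmetic instead. The extra guard against `x_LE1 == x_LE0` is likewise an assumption added to avoid a zero denominator.

```python
from fractions import Fraction


def inherit_from_subblocks(le0, le1, lec0, lec1, pos):
    """Evaluate the form of Equations 44 to 46 at one control-point position.

    Each of le0, le1, lec0, lec1 is ((x, y), (vx, vy)): a sample position and
    the sub-block vector stored for it. pos is the (x, y) of the control point.
    """
    (x_le0, y_le0), (vx_le0, vy_le0) = le0
    (x_le1, y_le1), (vx_le1, vy_le1) = le1
    (x_lec0, y_lec0), (vx_lec0, vy_lec0) = lec0
    (x_lec1, y_lec1), (vx_lec1, vy_lec1) = lec1

    # Conditions of Equation 47.
    ok = (y_le1 == y_le0 and y_lec1 == y_lec0 and y_le1 != y_lec1
          and x_le0 + x_le1 == x_lec0 + x_lec1)
    # Extra guard (assumption, not part of Equation 47): non-zero denominator.
    if not ok or x_le1 == x_le0:
        raise ValueError("sample locations violate Equation 47")

    x, y = pos
    fx = Fraction(x - x_le0, x_le1 - x_le0)
    fy = Fraction(y - y_le0, 2 * (y_le1 - y_lec1))
    vx = ((vx_le1 - vx_le0) * fx
          + (vx_le1 + vx_le0 - vx_lec1 - vx_lec0) * fy + vx_le0)
    vy = ((vy_le1 - vy_le0) * fx
          + (vy_le1 + vy_le0 - vy_lec1 - vy_lec0) * fy + vy_le0)
    return vx, vy
```

Calling the function once for each of the positions $(x_0, y_0)$, $(x_1, y_1)$ and $(x_2, y_2)$ yields the three control point vectors. The last condition of Equation 47 is what makes the horizontal gradient cancel in the sum $v_{LE1} + v_{LE0} - v_{LEc1} - v_{LEc0}$, leaving only the vertical gradient scaled by $2(y_{LE1} - y_{LEc1})$.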

A special case arises if the selected neighboring PU is located at the top CTU boundary but the PU is not wide enough. For example, if PU E has a width of 8 samples, and the sub-block width is 4 samples, then only 2 bottom sub-block vectors of PU E, i.e. $\vec{v}_{LE0} = (v_{LE0x}, v_{LE0y})$ and $\vec{v}_{LE1} = (v_{LE1x}, v_{LE1y})$, can be stored in the regular motion data line buffer. In this case, the 4-parameter motion model as described in Equations 34, 35 and 36 is used to derive the control point vectors $\vec{v}_0$, $\vec{v}_1$ and $\vec{v}_2$ of the current PU. In some implementations, the current PU may be treated using the 4-parameter affine motion model, though it inherits the affine motion data from a neighboring PU using the 6-parameter affine motion model. For example, in some such implementations, the control point vectors $\vec{v}_0$ and $\vec{v}_1$ of the current PU are derived by using Equations 34 and 35. In other implementations, the inheritance of affine motion data in this case may simply be disabled for the current PU.

If the selected neighboring PU is not located at the top CTU boundary (for example, if PU D in FIG. 9A is the selected neighboring PU for the current PU coded in affine merge mode), then the $\vec{v}_0$, $\vec{v}_1$ and $\vec{v}_2$ of the current PU may be derived by directly using the neighboring control point vectors, e.g. using $\vec{v}_{D0} = (v_{D0x}, v_{D0y})$ at the top-left sample position $(x_{D0}, y_{D0})$, $\vec{v}_{D1} = (v_{D1x}, v_{D1y})$ at the top-right sample position $(x_{D1}, y_{D1})$ and $\vec{v}_{D2} = (v_{D2x}, v_{D2y})$ at the bottom-left sample position $(x_{D2}, y_{D2})$ of the neighboring PU D, as described in Equations 37, 38 and 39.

In some implementations, the control point vectors and PU sizes of the neighboring PUs, which are located along the top CTU boundary and coded in affine mode, may be directly stored in the regular motion data line buffer to avoid the need for a separate line buffer to hold the control point vectors and PU sizes of those PUs. FIG. 9B is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip mode, with control point vectors stored in a regular motion data line buffer, according to some implementations. As shown in FIG. 9B, the current PU uses a 6-parameter affine motion model, while the neighboring PUs along the top CTU boundary (e.g. PU E, B, C) may use the 6-parameter or 4-parameter affine motion models. The associated control point vectors (e.g. $\vec{v}_{E0}$, $\vec{v}_{E1}$ and $\vec{v}_{E2}$ of PU E, $\vec{v}_{B0}$ and $\vec{v}_{B1}$ of PU B, and $\vec{v}_{C0}$ and $\vec{v}_{C1}$ of PU C) are directly stored in the regular motion data line buffer.

With the control point vectors stored in the regular motion data line buffer, the affine motion data inheritance is straightforward. In this embodiment, it makes no difference whether the selected PU is along the top CTU boundary or not. For example, if the current PU inherits the affine motion data from PU E in FIG. 9B, and PU E uses the 6-parameter model, then the control point vectors $\vec{v}_0$, $\vec{v}_1$ (and $\vec{v}_2$) of the current PU coded in affine merge mode can be derived by:

$$
\begin{cases}
v_{0x} = \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(x_0 - x_{E0}) + \dfrac{v_{E2x} - v_{E0x}}{y_{E2} - y_{E0}}\,(y_0 - y_{E0}) + v_{E0x} \\[2ex]
v_{0y} = \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(x_0 - x_{E0}) + \dfrac{v_{E2y} - v_{E0y}}{y_{E2} - y_{E0}}\,(y_0 - y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 48)}
$$

$$
\begin{cases}
v_{1x} = \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(x_1 - x_{E0}) + \dfrac{v_{E2x} - v_{E0x}}{y_{E2} - y_{E0}}\,(y_1 - y_{E0}) + v_{E0x} \\[2ex]
v_{1y} = \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(x_1 - x_{E0}) + \dfrac{v_{E2y} - v_{E0y}}{y_{E2} - y_{E0}}\,(y_1 - y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 49)}
$$

And if the current PU uses the 6-parameter affine motion model, by:

$$
\begin{cases}
v_{2x} = \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(x_2 - x_{E0}) + \dfrac{v_{E2x} - v_{E0x}}{y_{E2} - y_{E0}}\,(y_2 - y_{E0}) + v_{E0x} \\[2ex]
v_{2y} = \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(x_2 - x_{E0}) + \dfrac{v_{E2y} - v_{E0y}}{y_{E2} - y_{E0}}\,(y_2 - y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 50)}
$$
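Equations 48 to 50 share one form and differ only in the position at which the model is evaluated. The following Python sketch makes that explicit. The name `affine6`, the `((x, y), (vx, vy))` tuple layout and the exact `Fraction` arithmetic are illustrative assumptions, not part of the disclosure.

```python
from fractions import Fraction


def affine6(cp0, cp1, cp2, pos):
    """Evaluate the 6-parameter model of Equations 48 to 50 at pos = (x, y).

    cp0, cp1, cp2 are ((x, y), (vx, vy)) for the top-left, top-right and
    bottom-left control points of the neighboring PU (PU E in the text).
    """
    (x_e0, y_e0), (v0x, v0y) = cp0
    (x_e1, _), (v1x, v1y) = cp1
    (_, y_e2), (v2x, v2y) = cp2
    x, y = pos
    # Offsets relative to the horizontal and vertical control-point spacing.
    fx = Fraction(x - x_e0, x_e1 - x_e0)
    fy = Fraction(y - y_e0, y_e2 - y_e0)
    return ((v1x - v0x) * fx + (v2x - v0x) * fy + v0x,
            (v1y - v0y) * fx + (v2y - v0y) * fy + v0y)
```

Evaluating `affine6` at $(x_0, y_0)$, $(x_1, y_1)$ and, for a 6-parameter current PU, $(x_2, y_2)$ gives $\vec{v}_0$, $\vec{v}_1$ and $\vec{v}_2$ respectively.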

Likewise, if PU E uses the 4-parameter affine motion model, then the control point vectors $\vec{v}_0$, $\vec{v}_1$ (and $\vec{v}_2$) of the current PU coded in affine merge mode can be derived by:

$$
\begin{cases}
v_{0x} = \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(x_0 - x_{E0}) - \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(y_0 - y_{E0}) + v_{E0x} \\[2ex]
v_{0y} = \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(x_0 - x_{E0}) + \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(y_0 - y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 51)}
$$

$$
\begin{cases}
v_{1x} = \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(x_1 - x_{E0}) - \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(y_1 - y_{E0}) + v_{E0x} \\[2ex]
v_{1y} = \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(x_1 - x_{E0}) + \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(y_1 - y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 52)}
$$

And if the current PU uses the 6-parameter affine motion model, by:

$$
\begin{cases}
v_{2x} = \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(x_2 - x_{E0}) - \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(y_2 - y_{E0}) + v_{E0x} \\[2ex]
v_{2y} = \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(x_2 - x_{E0}) + \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(y_2 - y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 53)}
$$
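In the 4-parameter case of Equations 51 to 53, the vertical gradient is not stored but is obtained from the horizontal gradient with swapped components and a sign change. A minimal Python sketch follows; the name `affine4`, the tuple layout and the use of `Fraction` are assumptions for illustration.

```python
from fractions import Fraction


def affine4(cp0, cp1, pos):
    """Evaluate the 4-parameter model of Equations 51 to 53 at pos = (x, y).

    cp0 and cp1 are ((x, y), (vx, vy)) for the top-left and top-right
    control points of the neighboring PU (PU E in the text).
    """
    (x_e0, y_e0), (v0x, v0y) = cp0
    (x_e1, _), (v1x, v1y) = cp1
    x, y = pos
    a = Fraction(v1x - v0x, x_e1 - x_e0)  # horizontal gradient of vx
    b = Fraction(v1y - v0y, x_e1 - x_e0)  # horizontal gradient of vy
    dx, dy = x - x_e0, y - y_e0
    # The vertical gradients are (-b, a), so only two control points are needed.
    return (a * dx - b * dy + v0x, b * dx + a * dy + v0y)
```

Because only `cp0` and `cp1` are read, a neighboring PU coded with the 4-parameter model needs two slots for its control point vectors rather than three.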

For the merge/skip, AMVP and affine AMVP list derivation of the current PU, spatial neighboring sub-block motion vectors may be used, but they are not readily stored in the regular motion data line buffer in many implementations. Instead, a local motion data buffer for a CTU is provided to buffer the bottom sub-block vectors of the PUs along the top CTU boundary. If a neighboring PU along the top CTU boundary uses affine mode, the bottom sub-block vectors are computed by using the control point vectors stored in the regular motion data line buffer. FIG. 9C is an illustration of a shared motion data line buffer for affine merge and non-affine merge/skip utilizing a local motion data buffer, according to some implementations. In FIG. 9C, for example, if PU E uses the 6-parameter affine motion model, the bottom sub-block vectors of PU E, e.g. $\vec{v}_{LE0} = (v_{LE0x}, v_{LE0y})$ at sample position $(x_{LE0}, y_{LE0})$, etc., may be computed by using the 6-parameter affine motion model and stored in the local motion data buffer:

$$
\begin{cases}
v_{LE0x} = \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(x_{LE0} - x_{E0}) + \dfrac{v_{E2x} - v_{E0x}}{y_{E2} - y_{E0}}\,(y_{LE0} - y_{E0}) + v_{E0x} \\[2ex]
v_{LE0y} = \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(x_{LE0} - x_{E0}) + \dfrac{v_{E2y} - v_{E0y}}{y_{E2} - y_{E0}}\,(y_{LE0} - y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 54)}
$$

Likewise, if PU E uses the 4-parameter affine motion model, sub-block vectors, e.g. $\vec{v}_{LE0} = (v_{LE0x}, v_{LE0y})$ at sample position $(x_{LE0}, y_{LE0})$, etc., may be computed by using the 4-parameter affine motion model and stored in the local motion data buffer:

$$
\begin{cases}
v_{LE0x} = \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(x_{LE0} - x_{E0}) - \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(y_{LE0} - y_{E0}) + v_{E0x} \\[2ex]
v_{LE0y} = \dfrac{v_{E1y} - v_{E0y}}{x_{E1} - x_{E0}}\,(x_{LE0} - x_{E0}) + \dfrac{v_{E1x} - v_{E0x}}{x_{E1} - x_{E0}}\,(y_{LE0} - y_{E0}) + v_{E0y}
\end{cases} \qquad \text{(Equation 55)}
$$

In such embodiments, the current PU uses sub-block vectors stored in the local motion data buffer (instead of the regular motion data line buffer, which stores control point vectors) for the merge/skip, AMVP and affine AMVP list derivation. The derived sub-block vectors of PUs coded in affine mode may also be stored as temporal motion vectors for use by future pictures.
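Filling the local motion data buffer can be sketched as a loop over the bottom sub-block row of a neighboring PU. In the Python sketch below, `fill_local_buffer` and its parameters are hypothetical names, and `model` stands for any callable that maps a sample position to a motion vector, for instance an implementation of Equation 54 or Equation 55. Sampling at the sub-block centre is an assumption of this sketch; the text only refers to a sample position $(x_{LE0}, y_{LE0})$ without fixing it.

```python
def fill_local_buffer(model, pu_x, pu_y, pu_w, pu_h, sub=4):
    """Return {sub-block x: vector} for the bottom sub-block row of one PU.

    model: callable taking (x, y) and returning (vx, vy).
    pu_x, pu_y, pu_w, pu_h: top-left position and size of the PU in samples.
    sub: sub-block width and height in samples.
    """
    # Assumed sample row: vertical centre of the bottom sub-block row.
    y_s = pu_y + pu_h - sub // 2
    return {x: model((x + sub // 2, y_s))
            for x in range(pu_x, pu_x + pu_w, sub)}
```

The resulting dictionary plays the role of the local motion data buffer entries for that PU, while the control point vectors themselves remain in the regular motion data line buffer.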

In some implementations, the 6-parameter affine mode may be disabled for PUs of small width so that the regular motion data line buffer has enough space to store control point vectors. For example, if the sub-block width is 4, then the 6-parameter affine mode may be disabled for PUs of width less than or equal to 8 samples. In some implementations, for a PU with a width of 8, only two sub-block slots are available in the regular motion data line buffer for the PU to store control point vectors, but a PU coded in the 6-parameter affine mode needs to store 3 control point vectors. Disabling the 6-parameter affine mode addresses such narrow PUs.
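The slot arithmetic behind this restriction can be written as a one-line check. The function name `six_param_allowed` and its parameters are illustrative assumptions; the sketch only encodes the rule that one line-buffer slot is available per sub-block column and that the 6-parameter mode needs three of them.

```python
def six_param_allowed(pu_width, sub_width=4, needed_slots=3):
    """Return True if a PU of the given width has enough line-buffer slots.

    One slot is assumed per sub-block column; the 6-parameter affine mode
    must store three control point vectors.
    """
    slots = pu_width // sub_width
    return slots >= needed_slots
```

With a sub-block width of 4, a PU of width 8 has two slots and the check fails, while a PU of width 16 has four slots and the check passes.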

FIG. 10 is a flow chart of a method for decoding video via an adaptive affine motion model, according to some implementations. During decoding of a prediction unit of a coding unit, as discussed above, at step 1000 the decoder may select one or more neighboring prediction units as motion vector references. As discussed above, in some implementations, the selected neighboring prediction unit may be at the top boundary of the current coding tree unit (CTU). The decoder may determine this by, in some implementations, determining if a sum of a y component of a luma location specifying a top-left sample of the neighboring luma coding block relative to the top-left luma sample of the current picture (yNb) and a height of the neighboring luma coding block (nNbH), modulo a vertical array size of the luma coding tree block in samples (CtbSizeY), is equal to zero (e.g. ((yNb+nNbH) % CtbSizeY)=0). If it is, then the decoder may also determine if the sum of yNb and nNbH is equal to a y component of a luma location specifying the top-left sample of the current coding block relative to the top-left luma sample of the current picture (yCb) (i.e. yNb+nNbH=yCb). If so, then the neighboring luma coding block is at the top boundary of the current coding tree unit.
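The two-part boundary test described above can be sketched in Python as follows. The function and parameter names are illustrative; they mirror the variables yNb, nNbH, yCb and CtbSizeY used in the text.

```python
def neighbor_at_top_ctu_boundary(y_nb, n_nb_h, y_cb, ctb_size_y):
    """Return True if the neighboring block sits just above the current CTU.

    y_nb: y of the neighbor's top-left luma sample (yNb).
    n_nb_h: height of the neighboring luma coding block (nNbH).
    y_cb: y of the current block's top-left luma sample (yCb).
    ctb_size_y: luma coding tree block height in samples (CtbSizeY).
    """
    bottom = y_nb + n_nb_h
    # First test: the neighbor ends exactly on a CTU row boundary.
    # Second test: that boundary is the top edge of the current block.
    return bottom % ctb_size_y == 0 and bottom == y_cb
```

For example, with CtbSizeY equal to 128, a 16-sample-high neighbor at yNb = 112 is at the top boundary for a current block at yCb = 128, whereas a neighbor at yNb = 128 is not at the boundary for a current block at yCb = 144.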

If the neighboring luma coding block is not at the top boundary of the current coding tree unit, at step 1004, the decoder may determine sub-block motion vectors based on control point motion vectors. Conversely, if the neighboring luma coding block is at the top boundary of the current coding tree unit, then, at step 1006, the decoder may determine sub-block motion vectors based on the neighboring sub-block vectors that are stored in the regular motion data line buffer. In some implementations, this derivation of sub-block motion vectors based on neighboring sub-block vectors may be done via the calculation of Equations 34-36 discussed above, or any of the similar sets of equations above, depending on implementation.

Once motion vectors have been derived, in some implementations, at step 1008, the prediction unit may be decoded as discussed above. At step 1010, the prediction unit may be provided as an output of the decoder (e.g. as part of a reconstructed picture for display).

Accordingly, the systems and methods discussed herein provide for significant reduction in memory utilization while providing high efficiency derivation of motion data for affine merge mode. In a first aspect, the present disclosure is directed to a method for reduced memory utilization for motion data derivation in encoded video. The method includes determining, by a video decoder of a device from an input video bitstream, one or more control point motion vectors of a first prediction unit of a first coding tree unit, based on a plurality of motion vectors of a second one or more prediction units neighboring the first prediction unit stored in a motion data line buffer of the device. The method also includes decoding, by the video decoder, one or more sub-blocks of the first prediction unit based on the determined one or more control point motion vectors.

In some implementations, the second one or more prediction units are from a second coding tree unit neighboring the first coding tree unit. In a further implementation, the second one or more prediction units are located at a top boundary of the first coding tree unit. In a still further implementation, the motion vectors of the second one or more prediction units are stored in the motion data line buffer of the device during decoding of the first coding tree unit. In another further implementation, the method includes deriving the one or more control point motion vectors of the first prediction unit proportional to an offset between a sample position of the first prediction unit and a sample position of the second one or more prediction units. In yet another further implementation, the method includes determining, by the video decoder, a second one or more control point motion vectors of another prediction unit of the first coding tree unit based on control point motion vectors, responsive to a third one or more prediction units neighboring the another prediction unit not being located at a top boundary of the first coding tree unit; and decoding, by the video decoder, one or more sub-blocks of the another prediction unit based on the determined second one or more control point motion vectors.

In some implementations, the method includes calculating a difference between a control point motion vector and motion data of the second one or more prediction units neighboring the first prediction unit. In some implementations, the method includes calculating an offset from the motion data of the second one or more prediction units neighboring the first prediction unit based on a height or width of the corresponding second one or more prediction units. In a further implementation, an identification of the height or width of the corresponding second one or more prediction units is stored in an affine motion data line buffer.

In some implementations, the method includes deriving sub-block motion data of the one or more sub-blocks based on the determined one or more control point motion vectors. In some implementations, the method includes providing, by the video decoder to a display device, the decoded one or more sub-blocks of the first prediction unit.

In another aspect, the present disclosure is directed to a system for reduced memory utilization for motion data derivation in encoded video. The system includes a motion data line buffer; and a video decoder, configured to: determine, from an input video bitstream, one or more control point motion vectors of a first prediction unit of a first coding tree unit, based on a plurality of motion vectors of a second one or more prediction units neighboring the first prediction unit stored in the motion data line buffer, and decode one or more sub-blocks of the first prediction unit based on the determined one or more control point motion vectors.

In some implementations, the second one or more prediction units are from a second coding tree unit neighboring the first coding tree unit. In some implementations, the second one or more prediction units are located at a top boundary of the first coding tree unit. In a further implementation, the motion vectors of the second one or more prediction units are stored in the motion data line buffer of the device during decoding of the first coding tree unit. In another implementation, the decoder is further configured to derive the one or more control point motion vectors of the first prediction unit proportional to an offset between a sample position of the first prediction unit and a sample position of the second one or more prediction units. In another implementation, the decoder is further configured to: determine a second one or more control point motion vectors of another prediction unit of the first coding tree unit based on control point motion vectors, responsive to a third one or more prediction units neighboring the another prediction unit not being located at a top boundary of the first coding tree unit; and decode one or more sub-blocks of the another prediction unit based on the determined second one or more control point motion vectors.

In some implementations, the decoder is further configured to calculatea difference between a control point motion vector and motion data ofthe second one or more prediction units neighboring the first predictionunit. In some implementations, the decoder is further configured tocalculate an offset from the motion data of the second one or moreprediction units neighboring the first prediction unit based on a heightor width of the corresponding second one or more prediction units. Insome implementations, the system includes an affine motion data linebuffer configured to store an identification the height or width of thecorresponding second one or more prediction units.

In some implementations, the decoder is further configured to derive sub-block motion data of the one or more sub-blocks based on the determined one or more control point motion vectors. In some implementations, the decoder is further configured to provide, to a display device, the decoded one or more sub-blocks of the first prediction unit.

B. Computing and Network Environment

Having discussed specific embodiments of the present solution, it may be helpful to describe aspects of the operating environment as well as associated system components (e.g., hardware elements) in connection with the methods and systems described herein. Referring to FIG. 11A, an embodiment of a network environment is depicted. In brief overview, the network environment includes a wireless communication system that includes one or more access points 1106, one or more wireless communication devices 1102 and a network hardware component 1192. The wireless communication devices 1102 may for example include laptop computers 1102, tablets 1102, personal computers 1102 and/or cellular telephone devices 1102. The details of an embodiment of each wireless communication device and/or access point are described in greater detail with reference to FIGS. 11B and 11C. The network environment can be an ad hoc network environment, an infrastructure wireless network environment, a subnet environment, etc. in one embodiment.

The access points (APs) 1106 may be operably coupled to the network hardware 1192 via local area network connections. The network hardware 1192, which may include a router, gateway, switch, bridge, modem, system controller, appliance, etc., may provide a local area network connection for the communication system. Each of the access points 1106 may have an associated antenna or an antenna array to communicate with the wireless communication devices 1102 in its area. The wireless communication devices 1102 may register with a particular access point 1106 to receive services from the communication system (e.g., via a SU-MIMO or MU-MIMO configuration). For direct connections (e.g., point-to-point communications), some wireless communication devices 1102 may communicate directly via an allocated channel and communications protocol. Some of the wireless communication devices 1102 may be mobile or relatively static with respect to the access point 1106.

In some embodiments, an access point 1106 includes a device or module (including a combination of hardware and software) that allows wireless communication devices 1102 to connect to a wired network using Wi-Fi, or other standards. An access point 1106 may sometimes be referred to as a wireless access point (WAP). An access point 1106 may be configured, designed and/or built for operating in a wireless local area network (WLAN). An access point 1106 may connect to a router (e.g., via a wired network) as a standalone device in some embodiments. In other embodiments, an access point can be a component of a router. An access point 1106 can provide multiple devices 1102 access to a network. An access point 1106 may, for example, connect to a wired Ethernet connection and provide wireless connections using radio frequency links for other devices 1102 to utilize that wired connection. An access point 1106 may be built and/or configured to support a standard for sending and receiving data using one or more radio frequencies. Those standards, and the frequencies they use, may be defined by the IEEE (e.g., IEEE 802.11 standards). An access point may be configured and/or used to support public Internet hotspots, and/or on an internal network to extend the network's Wi-Fi signal range.

In some embodiments, the access points 1106 may be used for (e.g., in-home or in-building) wireless networks (e.g., IEEE 802.11, Bluetooth, ZigBee, any other type of radio frequency based network protocol and/or variations thereof). Each of the wireless communication devices 1102 may include a built-in radio and/or be coupled to a radio. Such wireless communication devices 1102 and/or access points 1106 may operate in accordance with the various aspects of the disclosure as presented herein to enhance performance, reduce costs and/or size, and/or enhance broadband applications. Each wireless communication device 1102 may have the capacity to function as a client node seeking access to resources (e.g., data, and connection to networked nodes such as servers) via one or more access points 1106.

The network connections may include any type and/or form of network and may include any of the following: a point-to-point network, a broadcast network, a telecommunications network, a data communication network, a computer network. The topology of the network may be a bus, star, or ring network topology. The network may be of any such network topology as known to those ordinarily skilled in the art capable of supporting the operations described herein. In some embodiments, different types of data may be transmitted via different protocols. In other embodiments, the same types of data may be transmitted via different protocols.

The communications device(s) 1102 and access point(s) 1106 may be deployed as and/or executed on any type and form of computing device, such as a computer, network device or appliance capable of communicating on any type and form of network and performing the operations described herein. FIGS. 11B and 11C depict block diagrams of a computing device 1100 useful for practicing an embodiment of the wireless communication devices 1102 or the access point 1106. As shown in FIGS. 11B and 11C, each computing device 1100 includes a central processing unit 1121, and a main memory unit 1122. As shown in FIG. 11B, a computing device 1100 may include a storage device 1128, an installation device 1116, a network interface 1118, an I/O controller 1123, display devices 1124a-1124n, a keyboard 1126 and a pointing device 1127, such as a mouse. The storage device 1128 may include, without limitation, an operating system and/or software. As shown in FIG. 11C, each computing device 1100 may also include additional optional elements, such as a memory port 1103, a bridge 1170, one or more input/output devices 1130a-1130n (generally referred to using reference numeral 1130), and a cache memory 1140 in communication with the central processing unit 1121.

The central processing unit 1121 is any logic circuitry that responds to and processes instructions fetched from the main memory unit 1122. In many embodiments, the central processing unit 1121 is provided by a microprocessor unit, such as: those manufactured by Intel Corporation of Santa Clara, Calif.; those manufactured by International Business Machines of Armonk, N.Y.; or those manufactured by Advanced Micro Devices of Sunnyvale, Calif. The computing device 1100 may be based on any of these processors, or any other processor capable of operating as described herein.

Main memory unit 1122 may be one or more memory chips capable of storing data and allowing any storage location to be directly accessed by the microprocessor 1121, such as any type or variant of Static random access memory (SRAM), Dynamic random access memory (DRAM), Ferroelectric RAM (FRAM), NAND Flash, NOR Flash and Solid State Drives (SSD). The main memory 1122 may be based on any of the above described memory chips, or any other available memory chips capable of operating as described herein. In the embodiment shown in FIG. 11B, the processor 1121 communicates with main memory 1122 via a system bus 1150 (described in more detail below). FIG. 11C depicts an embodiment of a computing device 1100 in which the processor communicates directly with main memory 1122 via a memory port 1103. For example, in FIG. 11C the main memory 1122 may be DRDRAM.

Processor 1121 and/or main memory 1122 may be used for video encoding and/or decoding, as well as other video processing features (including processing of animations, slide shows, or other multimedia). For example, main memory 1122 may comprise memory buffers needed for a software/hardware codec for VVC encoding and/or decoding. Processor 1121 may comprise a software/hardware VVC encoder and/or decoder; communicate with a separate co-processor comprising a VVC encoder and/or decoder; and/or execute instructions for encoding and decoding media stored in main memory 1122.

FIG. 11C depicts an embodiment in which the main processor 1121 communicates directly with cache memory 1140 via a secondary bus, sometimes referred to as a backside bus. In other embodiments, the main processor 1121 communicates with cache memory 1140 using the system bus 1150. Cache memory 1140 typically has a faster response time than main memory 1122 and is provided by, for example, SRAM, BSRAM, or EDRAM. In the embodiment shown in FIG. 11C, the processor 1121 communicates with various I/O devices 1130 via a local system bus 1150. Various buses may be used to connect the central processing unit 1121 to any of the I/O devices 1130, for example, a VESA VL bus, an ISA bus, an EISA bus, a MicroChannel Architecture (MCA) bus, a PCI bus, a PCI-X bus, a PCI-Express bus, or a NuBus. For embodiments in which the I/O device is a video display 1124, the processor 1121 may use an Advanced Graphics Port (AGP) to communicate with the display 1124. FIG. 11C depicts an embodiment of a computer 1100 in which the main processor 1121 may communicate directly with I/O device 1130b, for example via HYPERTRANSPORT, RAPIDIO, or INFINIBAND communications technology. FIG. 11C also depicts an embodiment in which local busses and direct communication are mixed: the processor 1121 communicates with I/O device 1130a using a local interconnect bus while communicating with I/O device 1130b directly.

A wide variety of I/O devices 1130a-1130n may be present in the computing device 1100. Input devices include keyboards, mice, trackpads, trackballs, microphones, dials, touch pads, touch screens, and drawing tablets. Output devices include video displays, speakers, inkjet printers, laser printers, projectors and dye-sublimation printers. The I/O devices may be controlled by an I/O controller 1123 as shown in FIG. 11B. The I/O controller may control one or more I/O devices such as a keyboard 1126 and a pointing device 1127, e.g., a mouse or optical pen. Furthermore, an I/O device may also provide storage and/or an installation medium 1116 for the computing device 1100. In still other embodiments, the computing device 1100 may provide USB connections (not shown) to receive handheld USB storage devices such as the USB Flash Drive line of devices manufactured by Twintech Industry, Inc. of Los Alamitos, Calif.

Referring again to FIG. 11B, the computing device 1100 may support any suitable installation device 1116, such as a disk drive, a CD-ROM drive, a CD-R/RW drive, a DVD-ROM drive, a flash memory drive, tape drives of various formats, USB device, hard-drive, a network interface, or any other device suitable for installing software and programs. The computing device 1100 may further include a storage device, such as one or more hard disk drives or redundant arrays of independent disks, for storing an operating system and other related software, and for storing application software programs such as any program or software 1120 for implementing (e.g., configured and/or designed for) the systems and methods described herein. Optionally, any of the installation devices 1116 could also be used as the storage device. Additionally, the operating system and the software can be run from a bootable medium.

Furthermore, the computing device 1100 may include a network interface 1118 to interface to the network 1104 through a variety of connections including, but not limited to, standard telephone lines, LAN or WAN links (e.g., 802.11, T1, T3, 56 kb, X.25, SNA, DECNET), broadband connections (e.g., ISDN, Frame Relay, ATM, Gigabit Ethernet, Ethernet-over-SONET), wireless connections, or some combination of any or all of the above. Connections can be established using a variety of communication protocols (e.g., TCP/IP, IPX, SPX, NetBIOS, Ethernet, ARCNET, SONET, SDH, Fiber Distributed Data Interface (FDDI), RS232, IEEE 802.11, IEEE 802.11a, IEEE 802.11b, IEEE 802.11g, IEEE 802.11n, IEEE 802.11ac, IEEE 802.11ad, CDMA, GSM, WiMax and direct asynchronous connections). In one embodiment, the computing device 1100 communicates with other computing devices 1100′ via any type and/or form of gateway or tunneling protocol such as Secure Socket Layer (SSL) or Transport Layer Security (TLS). The network interface 1118 may include a built-in network adapter, network interface card, PCMCIA network card, card bus network adapter, wireless network adapter, USB network adapter, modem or any other device suitable for interfacing the computing device 1100 to any type of network capable of communication and performing the operations described herein.

In some embodiments, the computing device 1100 may include or be connected to one or more display devices 1124a-1124n. As such, any of the I/O devices 1130a-1130n and/or the I/O controller 1123 may include any type and/or form of suitable hardware, software, or combination of hardware and software to support, enable or provide for the connection and use of the display device(s) 1124a-1124n by the computing device 1100. For example, the computing device 1100 may include any type and/or form of video adapter, video card, driver, and/or library to interface, communicate, connect or otherwise use the display device(s) 1124a-1124n. In one embodiment, a video adapter may include multiple connectors to interface to the display device(s) 1124a-1124n. In other embodiments, the computing device 1100 may include multiple video adapters, with each video adapter connected to the display device(s) 1124a-1124n. In some embodiments, any portion of the operating system of the computing device 1100 may be configured for using multiple displays 1124a-1124n. One ordinarily skilled in the art will recognize and appreciate the various ways and embodiments that a computing device 1100 may be configured to have one or more display devices 1124a-1124n.

In further embodiments, an I/O device 1130 may be a bridge between the system bus 1150 and an external communication bus, such as a USB bus, an Apple Desktop Bus, an RS-232 serial connection, a SCSI bus, a FireWire bus, a FireWire 800 bus, an Ethernet bus, an AppleTalk bus, a Gigabit Ethernet bus, an Asynchronous Transfer Mode bus, a FibreChannel bus, a Serial Attached small computer system interface bus, a USB connection, or an HDMI bus.

A computing device 1100 of the sort depicted in FIGS. 11B and 11C may operate under the control of an operating system, which controls scheduling of tasks and access to system resources. The computing device 1100 can be running any operating system such as any of the versions of the MICROSOFT WINDOWS operating systems, the different releases of the Unix and Linux operating systems, any version of the MAC OS for Macintosh computers, any embedded operating system, any real-time operating system, any open source operating system, any proprietary operating system, any operating systems for mobile computing devices, or any other operating system capable of running on the computing device and performing the operations described herein. Typical operating systems include, but are not limited to: Android, produced by Google Inc.; WINDOWS 7 and 8, produced by Microsoft Corporation of Redmond, Wash.; MAC OS, produced by Apple Computer of Cupertino, Calif.; WebOS, produced by Research In Motion (RIM); OS/2, produced by International Business Machines of Armonk, N.Y.; and Linux, a freely-available operating system distributed by Caldera Corp. of Salt Lake City, Utah, or any type and/or form of a Unix operating system, among others.

The computer system 1100 can be any workstation, telephone, desktop computer, laptop or notebook computer, server, handheld computer, mobile telephone or other portable telecommunications device, media playing device, a gaming system, mobile computing device, or any other type and/or form of computing, telecommunications or media device that is capable of communication. The computer system 1100 has sufficient processor power and memory capacity to perform the operations described herein.

In some embodiments, the computing device 1100 may have different processors, operating systems, and input devices consistent with the device. For example, in one embodiment, the computing device 1100 is a smart phone, mobile device, tablet or personal digital assistant. In still other embodiments, the computing device 1100 is an Android-based mobile device, an iPhone smart phone manufactured by Apple Computer of Cupertino, Calif., or a Blackberry or WebOS-based handheld device or smart phone, such as the devices manufactured by Research In Motion Limited. Moreover, the computing device 1100 can be any workstation, desktop computer, laptop or notebook computer, server, handheld computer, mobile telephone, any other computer, or other form of computing or telecommunications device that is capable of communication and that has sufficient processor power and memory capacity to perform the operations described herein.

Although the disclosure may reference one or more “users”, such “users” may refer to user-associated devices or stations (STAs), for example, consistent with the terms “user” and “multi-user” typically used in the context of a multi-user multiple-input and multiple-output (MU-MIMO) environment.

Although examples of communications systems described above may include devices and APs operating according to an 802.11 standard, it should be understood that embodiments of the systems and methods described can operate according to other standards and use wireless communications devices other than devices configured as devices and APs. For example, multiple-unit communication interfaces associated with cellular networks, satellite communications, vehicle communication networks, and other non-802.11 wireless networks can utilize the systems and methods described herein to achieve improved overall capacity and/or link quality without departing from the scope of the systems and methods described herein.

It should be noted that certain passages of this disclosure may reference terms such as “first” and “second” in connection with devices, mode of operation, transmit chains, antennas, etc., for purposes of identifying or differentiating one from another or from others. These terms are not intended to merely relate entities (e.g., a first device and a second device) temporally or according to a sequence, although in some cases, these entities may include such a relationship. Nor do these terms limit the number of possible entities (e.g., devices) that may operate within a system or environment.

It should be understood that the systems described above may provide multiple ones of any or each of those components and these components may be provided on either a standalone machine or, in some embodiments, on multiple machines in a distributed system. In addition, the systems and methods described above may be provided as one or more computer-readable programs or executable instructions embodied on or in one or more articles of manufacture. The article of manufacture may be a floppy disk, a hard disk, a CD-ROM, a flash memory card, a PROM, a RAM, a ROM, or a magnetic tape. In general, the computer-readable programs may be implemented in any programming language, such as LISP, PERL, C, C++, C#, PROLOG, or in any byte code language such as JAVA. The software programs or executable instructions may be stored on or in one or more articles of manufacture as object code.

While the foregoing written description of the methods and systems enables one of ordinary skill to make and use what is considered presently to be the best mode thereof, those of ordinary skill will understand and appreciate the existence of variations, combinations, and equivalents of the specific embodiment, method, and examples herein. The present methods and systems should therefore not be limited by the above described embodiments, methods, and examples, but by all embodiments and methods within the scope and spirit of the disclosure.

We claim:
 1. A method for reduced memory utilization for motion data derivation in encoded video, comprising: determining, by a video decoder of a device from an input video bitstream, one or more control point motion vectors of a first prediction unit of a first coding tree unit, based on a plurality of motion vectors of one or more second prediction units neighboring the first prediction unit stored in a motion data line buffer of the device, the one or more second prediction units from a second coding tree unit neighboring the first coding tree unit; and decoding, by the video decoder, one or more sub-blocks of the first prediction unit based on the determined one or more control point motion vectors.
 2. The method of claim 1, wherein the one or more second prediction units are located at a top boundary of the first coding tree unit.
 3. The method of claim 2, wherein the plurality of motion vectors of the one or more second prediction units are stored in the motion data line buffer of the device during decoding of the first coding tree unit.
 4. The method of claim 3, further comprising deriving the one or more control point motion vectors of the first prediction unit proportional to an offset between a sample position of the first prediction unit and a sample position of the one or more second prediction units.
 5. The method of claim 3, further comprising: determining, by the video decoder, one or more second control point motion vectors of another prediction unit of the first coding tree unit based on the one or more control point motion vectors, responsive to one or more third prediction units neighboring the another prediction unit not being located at the top boundary of the first coding tree unit; and decoding, by the video decoder, one or more sub-blocks of the another prediction unit based on the determined one or more second control point motion vectors.
 6. The method of claim 1, wherein determining the one or more control point motion vectors further comprises calculating an offset from motion data of the one or more second prediction units neighboring the first prediction unit based on a height or width of the one or more second prediction units.
 7. The method of claim 6, wherein an identification of the height or width of the one or more second prediction units is stored in an affine motion data line buffer.
 8. The method of claim 1, wherein decoding the one or more sub-blocks of the first prediction unit based on the determined one or more control point motion vectors further comprises deriving sub-block motion data of the one or more sub-blocks based on the determined one or more control point motion vectors.
 9. The method of claim 1, further comprising providing, by the video decoder to a display device, the decoded one or more sub-blocks of the first prediction unit.
 10. A system for reduced memory utilization for motion data derivation in encoded video, comprising: a motion data line buffer; and a video decoder, configured to: determine, from an input video bitstream, one or more control point motion vectors of a first prediction unit of a first coding tree unit, based on a plurality of motion vectors of one or more second prediction units neighboring the first prediction unit stored in the motion data line buffer, the one or more second prediction units from a second coding tree unit neighboring the first coding tree unit, and decode one or more sub-blocks of the first prediction unit based on the determined one or more control point motion vectors.
 11. The system of claim 10, wherein the one or more second prediction units are located at a top boundary of the first coding tree unit.
 12. The system of claim 11, wherein the plurality of motion vectors of the one or more second prediction units are stored in the motion data line buffer during decoding of the first coding tree unit.
 13. The system of claim 12, wherein the video decoder is further configured to derive the one or more control point motion vectors of the first prediction unit proportional to an offset between a sample position of the first prediction unit and a sample position of the one or more second prediction units.
 14. The system of claim 12, wherein the video decoder is further configured to: determine one or more second control point motion vectors of another prediction unit of the first coding tree unit based on the one or more control point motion vectors, responsive to one or more third prediction units neighboring the another prediction unit not being located at the top boundary of the first coding tree unit, and decode one or more sub-blocks of the another prediction unit based on the determined one or more second control point motion vectors.
 15. The system of claim 10, wherein the video decoder is further configured to calculate an offset from motion data of the one or more second prediction units neighboring the first prediction unit based on a height or width of the one or more second prediction units.
 16. The system of claim 15, further comprising an affine motion data line buffer configured to store an identification of the height or width of the one or more second prediction units.
 17. The system of claim 10, wherein the video decoder is further configured to derive sub-block motion data of the one or more sub-blocks based on the determined one or more control point motion vectors.
 18. A method for reduced memory utilization for motion data derivation in encoded video, comprising: for a first prediction unit of a first coding tree unit of an input video bitstream, determining, by a video decoder of a device, whether one or more second prediction units neighboring the first prediction unit are located at a top boundary of the first coding tree unit, the one or more second prediction units stored in a motion data line buffer of the device; either: (i) responsive to the one or more second prediction units being located at the top boundary of the first coding tree unit, deriving motion vectors for the first prediction unit from motion vectors of the one or more second prediction units, or (ii) responsive to the one or more second prediction units not being located at the top boundary of the first coding tree unit, deriving motion vectors for the first prediction unit from one or more control point motion vectors of the one or more second prediction units; and decoding, by the video decoder, one or more sub-blocks of the first prediction unit based on the derived motion vectors.
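
By way of non-limiting illustration only, the derivation recited above (control point motion vectors obtained from motion vectors held in a line buffer, scaled by a sample-position offset, followed by sub-block motion data derivation) may be sketched as follows. The sketch assumes a 4-parameter affine model and two stored motion vectors on the same sample row of the neighboring prediction unit. The function names, argument layout, and use of exact fractions are hypothetical and chosen for readability; an actual decoder would typically use fixed-point integer arithmetic with shifts and rounding, and the claims are not limited to this form.

```python
from fractions import Fraction


def extrapolate_mv(pos_a, mv_a, pos_b, mv_b, target):
    """Evaluate a 4-parameter affine motion model at `target`.

    pos_a and pos_b are two sample positions on the same row (for example,
    the bottom row of a neighboring prediction unit held in a line buffer);
    mv_a and mv_b are the motion vectors stored for those positions.
    All names here are illustrative assumptions, not taken from the claims.
    """
    (xa, ya), (xb, yb) = pos_a, pos_b
    if ya != yb or xa == xb:
        raise ValueError("positions must share a row and differ in x")
    span = xb - xa
    a = Fraction(mv_b[0] - mv_a[0], span)  # change of mv_x per sample in x
    b = Fraction(mv_b[1] - mv_a[1], span)  # change of mv_y per sample in x
    dx, dy = target[0] - xa, target[1] - ya  # offset between sample positions
    return (mv_a[0] + a * dx - b * dy, mv_a[1] + b * dx + a * dy)


def control_point_mvs(pos_a, mv_a, pos_b, mv_b, pu_x, pu_y, pu_w):
    """Top-left and top-right control point vectors of the current PU."""
    v0 = extrapolate_mv(pos_a, mv_a, pos_b, mv_b, (pu_x, pu_y))
    v1 = extrapolate_mv(pos_a, mv_a, pos_b, mv_b, (pu_x + pu_w, pu_y))
    return v0, v1


def subblock_mvs(v0, v1, pu_w, pu_h, sub=4):
    """Motion vector at the center of each sub x sub block, keyed by offset."""
    half = Fraction(sub, 2)
    return {
        (x, y): extrapolate_mv((0, 0), v0, (pu_w, 0), v1, (x + half, y + half))
        for y in range(0, pu_h, sub)
        for x in range(0, pu_w, sub)
    }


# Hypothetical example: neighbor row y = 63 lies just above a PU at (0, 64).
v0, v1 = control_point_mvs((0, 63), (4, 0), (16, 63), (8, 0), 0, 64, 16)
mvs = subblock_mvs(v0, v1, 16, 8)
```

In this sketch the same model evaluation serves both steps: once with the line-buffer vectors as anchors to obtain the control point vectors, and once with the control point vectors as anchors to obtain per-sub-block vectors.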